ERROR ANALYSIS OF COARSE-GRAINED KINETIC MONTE CARLO METHOD 
Abstract. The coarse-grained Monte Carlo (CGMC) algorithm was originally proposed in the series of works 
[20, 21, 24]. In this paper we further investigate the approximation properties of the coarse-graining procedure and 
provide both analytical and numerical evidence that the hierarchy of the coarse models is built in a systematic way 
that allows for error control in both transient and long-time simulations. We demonstrate that the numerical accu- 
racy of the CGMC algorithm as an approximation of stochastic lattice spin flip dynamics is of order two in terms 
of the coarse-graining ratio and that the natural small parameter is the coarse-graining ratio over the range of parti- 
cle/particle interactions. The error estimate is shown to hold in the weak convergence sense. We employ the derived 
analytical results to guide CGMC algorithms and we demonstrate a CPU speed-up in demanding computational 
regimes that involve nucleation, phase transitions and metastability. 
1. Introduction. Microscopic computational models for complex systems such as Molecular
Dynamics (MD) and Monte Carlo (MC) algorithms are typically formulated in terms
of simple rules describing interactions between individual particles or spin variables. The 
large number of variables and even larger number of interactions between them present the 
principal limitation for efficient simulations. Another restricting factor is the essentially
sequential nature of approximating the time evolution in particle systems, which yields a
substantial slowdown in the resolution of dynamics, especially in metastable regimes.

In [20, 21, 24] the authors started developing systematic mathematical strategies for the
coarse-graining of microscopic models, focusing on the paradigm of stochastic lattice dynam- 
ics and the corresponding MC simulators. In principle, coarse-grained models are expected to 
have fewer observables than the original microscopic system making them computationally 
more efficient than the direct numerical simulations. In these papers a hierarchy of coarse- 
grained stochastic models - referred to as coarse-grained MC (CGMC) - was derived from the 
microscopic rules through a stochastic closure argument. The CGMC hierarchy is reminiscent
of Multi-Resolution Analysis approaches to the discretization of operators [4], spanning
length/time scales from the microscopic to the mesoscopic. The resulting stochastic coarse- 
grained processes involve Markovian birth-death and generalised exclusion processes and 
their combinations, and as demonstrated in [20, 21, 24], they share the same ergodic properties
with their microscopic counterparts. The full hierarchy of the coarse-grained stochastic
dynamics satisfies detailed balance relations; as a result, it not only yields a self-consistent
random fluctuation mechanism, but one that is also consistent with the underlying microscopic
fluctuations and the unresolved degrees of freedom. From the computational complexity perspective,
a comparison of CGMC with conventional MC methods for the same real time shows [20]
that the CPU time can decrease approximately as $O(1/q^2)$ or faster, where $q$ is the level of
coarse-graining, as demonstrated for spin-flip lattice dynamics. Thus, while for macroscopic
size systems in the millimeter length scale or larger, microscopic MC simulations are impractical
on a single processor, the computational savings of CGMC make it a suitable tool

*Department of Mathematics and Statistics, University of Massachusetts, Amherst, MA 01003-9305, USA,
markos@math.umass.edu

†Mathematics Institute, The University of Warwick, Coventry, CV4 7AL, United Kingdom,
plechac@maths.warwick.ac.uk

‡Department of Mathematics and Statistics, University of Massachusetts, Amherst, MA 01003-9305, USA,
sopas@math.umass.edu




M. A. Katsoulakis, P. Plechac, A. Sopasakis 



capable of capturing large scale features, while retaining microscopic information on intermolecular
forces and particle fluctuations. In the case of diffusion (spin exchange) dynamics
we also observe an additional coarse-graining in time by a factor $q^2$, improving the hydrodynamic
slowdown effect in the conservative MC [24].

In the recent paper [23] the authors rigorously analysed CGMC models as approximations
of conventional MC in non-equilibrium, by estimating the information loss between
microscopic and coarse-grained adsorption/desorption lattice dynamics. In analogy to the
numerical analysis for PDEs, an error analysis was carried out between the exact microscopic
process $\{\sigma_t\}_{t\ge 0}$ and the approximating coarse-grained process $\{\eta_t\}_{t\ge 0}$. The key step in
this direction was to use, as a quantitative measure for the loss of information in the coarse- 
graining from finer to coarser scales, the information-theoretic concept of the relative entropy 
between probability measures [8]. Such relative entropy estimates give a first mathematical
reasoning for the parameter regimes, i.e., the degree of coarse-graining versus the interac- 
tion range, for which CGMC is expected to give errors within a given tolerance. Using the 
rigorous results in [23] as a starting point, in this paper we focus on carrying out a detailed
numerical analysis of the error propagation for spin flip lattice dynamics. Due to the numerical
intractability of the relative entropy for a large particle system, we employ, in the numerical
error calculations, suitable computable upper and lower bounds, as well as targeted coarse
observables. The latter point of view necessitates the use of a weak convergence framework
for the study of the error between CGMC and direct numerical simulations of the stochastic 
lattice dynamics. We demonstrate that the numerical accuracy of the CGMC algorithm is of 
order two in terms of the ratio of the coarse-graining over the range of particle/particle interactions.
We also refer to recent work in [22] on weak error estimates between microscopic MC
algorithms and therein derived SDE approximations. Further details about a priori estimates
for weak convergence of approximations to SDEs can be found in [3, 33] and [27]. Related a
posteriori estimates are discussed in [32]. We further employ the derived analytical results to
guide CGMC algorithms and we demonstrate a CPU speed-up in demanding computational 
regimes that involve nucleation, phase transitions and metastability. We demonstrate com- 
putationally that CGMC probes efficiently the energy landscape, yielding spatial path-wise 
agreement with the underlying microscopic lattice dynamics, at least for fairly long but still 
finite interactions. 

The CGMC algorithms discussed here are related to a number of methods involving 
coarse-graining at various levels, for instance fast summation techniques, computational re- 
normalization and simulation and multi-scale computational methods for stochastic systems. 
One of the sources of the computational complexity of molecular simulations arises in the 
calculation of particle/particle interactions, especially in the case where long range forces are 
relevant. The evaluation cost of such pairwise interactions can be significantly reduced by 
applying well-controlled approximation schemes and/or a hierarchical decomposition of the 
computation. Such ideas have been successfully applied in the development of Ewald summation
techniques, multigrid (MG), fast multipole methods (FMP) or tree-code algorithms.
Typically, once the interaction terms are computed with one of these fast summation methods, 
they are entered in the microscopic algorithm where a simulation with a large number of in- 
dividually tracked particles has still to be carried out. The point of view adopted by CGMC is 
related to these methods in the sense that the interaction potential or operator is approximated 
in terms of a truncated multi-resolution decomposition within a given tolerance. The CGMC 
is subsequently defined at the coarse level specified by the truncation of the decomposition. 
However, a notable difference is that CGMC models track far fewer coarse observables instead
of simulating every individual particle. The equilibrium set-up of CGMC is essentially
given by the renormalised Hamiltonian after a single iteration in the renormalisation group 
flow. It is not surprising that such an approach, when applied to near critical temperature
simulations, has many limitations. For example, in the nearest-neighbour Ising-type models
this fact is manifested in the aforementioned error estimates and the comparative simulations
in [20]. On the other hand the focus of CGMC is dynamic simulations usually coupled to a
macroscopic system (see for instance the hybrid systems in [34, 19]), where criticality may
not be as important due to the presence of a time-varying external field. Nevertheless, further
corrections to the CGMC dynamics from the renormalisation group flow given by RGMC and
multigrid MC methods [12] can improve the order of convergence of the CGMC. We
refer to [18] for higher order accurate CGMC methods based on cluster expansions, where
the coarse-graining procedure described here is the model around which a cluster expansion
is carried out with controlled errors. In that sense the CGMC method is second-order accurate,
as explained in Section 4.

In recent years there has been a growing interest in developing and analysing coarse- 
graining methods for the purpose of modelling and simulation across scales. Such systems 
arise in a broad spectrum of scientific disciplines ranging from materials science to macro- 
molecular dynamics, to epidemiology and to atmosphere/ocean science. Various coarse- 
graining approaches may yield explicitly derived stochastic coarse models using different 
coarse approximations, e.g., H 1 31 [15] [16] [29] [31], or can be statistics-based [30], or may
rely on on-the-fly simulations, e.g., the equation-free method [25], the heterogeneous multi-scale
method [10], or multi-scale FE methods [14]. A systematic approach to upscaling of stochastic
systems has been also developed from the multi-level perspective in 01 El ID, where the
authors proposed algorithms for efficient multi-scale simulations using Monte Carlo meth- 
ods. Other coarse-graining techniques in the polymer science literature include the bond 
fluctuation model and its variants [28]. Such coarse-graining methodologies often rely on
parametrisation, hence at different conditions (e.g., temperature, density, composition) coarse
potentials need to be re-parametrised [30].

2. Microscopic lattice models. The presented analysis applies to the class of Ising-type
lattice systems. For the sake of simplicity we assume that the computational domain
is defined as the discrete periodic lattice $\Lambda_N = \frac{1}{n}\mathbb{Z}^d \cap \mathbb{T}$, which represents a discretisation of the
$d$-dimensional torus $\mathbb{T} = [0,1)^d$, where $d$ denotes the spatial dimension. We restrict the presentation
of the results to $d = 1$; nevertheless, higher dimensional cases are obtained without significant
changes. The algorithms can also be implemented on bounded domains with the usual
boundary conditions. The number of lattice sites $N = n^d$ is fixed. The microscopic degrees
of freedom, or the microscopic order parameter, are given by the spin-like variable $\sigma(x)$ defined
at each site $x \in \Lambda_N$. In this paper we discuss only the case of discrete spin variables,
i.e., $\sigma(x) \in \Sigma$ with $\Sigma = \{-1, 1\}$, $\Sigma = \{0, 1\}$ (Ising model) or $\Sigma = \{0, 1, \ldots, s\}$ (Potts
models). The case of the spin variable belonging to a compact Riemannian manifold, e.g.,
$\Sigma = S^2$ (Heisenberg model), $\Sigma = SU(2)$ (matrix model), will be studied elsewhere. We
denote by $\sigma = \{\sigma(x) \mid x \in \Lambda_N\}$ a configuration of spins on the lattice, i.e., an element of the
configuration space $S_N = \Sigma^{\Lambda_N}$. The interactions between spins at a given configuration $\sigma$
are defined by the microscopic Hamiltonian

$$H(\sigma) = -\frac{1}{2} \sum_{x\in\Lambda_N} \sum_{y\neq x} J(x-y)\,\sigma(x)\sigma(y) + \sum_{x\in\Lambda_N} h(x)\sigma(x) , \tag{2.1}$$

where h(x) denotes the external field at the site x. The two-body inter-particle potential J 
accounts for interactions between individual spins. We consider the class of potentials with 
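the following properties

$$J(x-y) = \frac{1}{L}\, V\Big(\frac{n}{L}|x-y|\Big) , \quad x, y \in \Lambda_N , \tag{2.2}$$
$$V : \mathbb{R} \to \mathbb{R} , \quad V(r) = V(-r) , \quad V(r) = 0 \ \text{ if } |r| > 1 . \tag{2.3}$$

We impose additional assumptions on $V$ which allow us to derive explicit error estimates:

$$V \text{ is smooth on } \mathbb{R} \setminus \{0\} , \tag{2.4}$$
$$\int_{\mathbb{R}} |V(r)|\,dr < \infty , \quad \text{and} \quad \int_{\mathbb{R}} |\partial_r V(r)|\,dr < \infty . \tag{2.5}$$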

Note that the summability condition for $V$ guarantees that the potential $J$ is also summable
due to the scaling factor. Hence the Hamiltonian is well defined even for $N, L \to \infty$. The
canonical equilibrium state is given in terms of the Gibbs measure

$$\mu_{N,\beta}(d\sigma) = \frac{1}{Z_{N,\beta}}\, e^{-\beta H(\sigma)} P_N(d\sigma) , \qquad Z_{N,\beta} = \int_{S_N} e^{-\beta H(\sigma)} P_N(d\sigma) , \tag{2.6}$$

where $P_N(d\sigma) = \prod_{x\in\Lambda_N} \rho(d\sigma(x))$ is the product measure on $S_N$ and the spins $\sigma(x)$ are
independent identically distributed (i.i.d.) random variables with the common distribution
$\rho$. For example, in the Ising model the prior distribution on $\Sigma = \{0, 1\}$ would typically be
$\rho(0) = \rho(1) = 1/2$.
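
For a very small lattice the Hamiltonian (2.1) and the Gibbs measure (2.6) can be evaluated by brute force, which may help to fix the notation. The following is only a minimal sketch under stated assumptions: $d = 1$, sites are indexed by integers $0, \ldots, n-1$ so that $n|x-y|$ is the periodic index distance, the prior is uniform, and the piecewise-constant $V$ and the helper names `make_J`, `hamiltonian`, `gibbs_measure` are hypothetical choices made for illustration.

```python
import itertools
import math

def make_J(n, L, V):
    """Return J as a function of the integer site separation k.

    Assumes J(x - y) = (1/L) V(n |x - y| / L) on a periodic lattice of n sites,
    so that n |x - y| is the periodic distance between site indices.
    """
    def J(k):
        dist = min(k % n, (-k) % n)   # periodic distance in lattice units
        return V(dist / L) / L
    return J

def hamiltonian(sigma, J, h):
    """H(sigma) = -1/2 sum_x sum_{y != x} J(x-y) sigma(x) sigma(y) + sum_x h(x) sigma(x)."""
    n = len(sigma)
    pair = sum(J(x - y) * sigma[x] * sigma[y]
               for x in range(n) for y in range(n) if y != x)
    field = sum(h[x] * sigma[x] for x in range(n))
    return -0.5 * pair + field

def gibbs_measure(n, J, h, beta):
    """Exact Gibbs measure on {0,1}^n for the uniform prior (small n only).

    The constant prior weight 2^(-n) cancels in the normalisation.
    """
    configs = list(itertools.product((0, 1), repeat=n))
    weights = [math.exp(-beta * hamiltonian(s, J, h)) for s in configs]
    Z = sum(weights)
    return {s: w / Z for s, w in zip(configs, weights)}

V = lambda r: 1.0 if abs(r) <= 1 else 0.0   # illustrative compactly supported potential
J = make_J(n=8, L=2, V=V)
mu = gibbs_measure(8, J, h=[0.5] * 8, beta=1.0)
```

The enumeration over $2^n$ configurations is exactly the cost that makes sampling necessary for realistic lattice sizes.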

The microscopic dynamics is defined as a continuous-time jump Markov process that
changes the spin $\sigma(x)$ with the probability $c(x, \sigma; \xi)\Delta t$ over the time interval
$[t, t + \Delta t]$. The function $c : \Lambda_N \times S_N \times \Sigma \to \mathbb{R}$ is called the rate of the process. The
jump process $\{\sigma_t\}_{t\ge 0}$ is constructed in the following way: suppose that at the time $t$ the
configuration is $\sigma_t$; then the probability of changing the spin at the site $x \in \Lambda_N$ spontaneously
from $\sigma_t(x)$ to a new value $\xi \in \Sigma$ over the time interval $[t, t + \Delta t]$ is $c(x, \sigma; \xi)\Delta t + O(\Delta t^2)$.
We denote the resulting configuration by $\sigma^{x,\xi}$. In the case of the Ising-type state space and
spin-flip dynamics we omit $\xi$ in this notation. The generator $\mathcal{L} : L^\infty(S_N) \to L^\infty(S_N)$ of
the Markov process acting on a bounded test function $\phi \in L^\infty(S_N)$ defined on the space of
configurations is given by

$$(\mathcal{L}\phi)(\sigma) = \sum_{x\in\Lambda_N} \int_\Sigma c(x, \sigma; \xi)\,\big(\phi(\sigma^{x,\xi}) - \phi(\sigma)\big)\,d\xi . \tag{2.7}$$

The evolution of an observable (a test function) $\phi$ is given by

$$\frac{d}{dt} E[\phi(\sigma_t)] = E[\mathcal{L}\phi(\sigma_t)] , \tag{2.8}$$

where the expectation operator $E[\cdot]$ is with respect to a measure conditioned on the initial
configuration $\sigma_{t=0} = \sigma_0$. We require that the dynamics is of relaxation type such that the
invariant measure of this Markov process is the Gibbs measure (2.6). A sufficient condition
is known as Detailed Balance (DB) and it imposes a condition on the form of the rate

$$c(x, \sigma; \xi)\, e^{-\beta H(\sigma)} = c(x, \sigma^{x,\xi}; \sigma(x))\, e^{-\beta H(\sigma^{x,\xi})} . \tag{2.9}$$

This condition has a simple interpretation: $c(x, \sigma; \xi)$ is the rate of converting $\sigma(x)$ to the
value $\xi$ while $c(x, \sigma^{x,\xi}; \sigma(x))$ is the rate of changing the spin with the value $\xi$ at the site $x$
back to $\sigma(x)$. The widely used class of Metropolis-type dynamics satisfies (2.9) and has the
rate given by

$$c(x, \sigma; \xi) = G(\beta \Delta_{x,\xi} H(\sigma)) , \quad \text{where} \quad \Delta_{x,\xi} H(\sigma) = H(\sigma^{x,\xi}) - H(\sigma) , \tag{2.10}$$

where $G$ is a continuous function satisfying $G(r) = G(-r)e^{-r}$ for all $r \in \mathbb{R}$. The most
common choices in physics simulations are $G(r) = \frac{1}{1+e^{r}}$ (Glauber dynamics), $G(r) =
e^{-[r]_+}$ (Metropolis dynamics), with $[r]_+ = r$ if $r \ge 0$ and $= 0$ otherwise, or $G(r) = e^{-r/2}$.
Such dynamics are often used as samplers from the canonical equilibrium Gibbs measure. 
However, the kinetic Monte Carlo method is also used for simulations of non-equilibrium 
processes. The dynamics in such a case is known as Arrhenius dynamics, whose rates are 
usually derived from transition state theory or obtained from molecular dynamics simulations. 
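
The relation $G(r) = G(-r)e^{-r}$ required of Metropolis-type rates can be checked numerically for the three choices of $G$ listed above. The snippet below is a small sketch; the function names and the sample points are arbitrary illustrative choices.

```python
import math

def glauber(r):
    return 1.0 / (1.0 + math.exp(r))

def metropolis(r):
    return math.exp(-max(r, 0.0))

def symmetric(r):
    return math.exp(-r / 2.0)

def satisfies_detailed_balance(G, rs, tol=1e-12):
    """Check G(r) = G(-r) exp(-r) at the sample points rs, up to rounding."""
    return all(abs(G(r) - G(-r) * math.exp(-r)) <= tol * max(1.0, G(r)) for r in rs)

samples = [-3.0, -1.0, -0.25, 0.0, 0.5, 2.0, 4.0]
results = {G.__name__: satisfies_detailed_balance(G, samples)
           for G in (glauber, metropolis, symmetric)}
```

Inserting $r = \beta \Delta_{x,\xi} H(\sigma)$ into this relation gives the detailed balance condition (2.9) for the rate (2.10).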

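To avoid unnecessary generality we restrict the description to the Ising-type model with
$\Sigma = \{0, 1\}$ used for modelling adsorption/desorption processes. We also omit $\xi$ in the notation.
The Arrhenius rate is defined as follows

$$c(x, \sigma) = \begin{cases} d_0 & \text{if } \sigma(x) = 0 , \\ d_0\, e^{-\beta U(x,\sigma)} & \text{if } \sigma(x) = 1 , \end{cases} \tag{2.11}$$

where

$$U(x, \sigma) = \sum_{y\in\Lambda_N,\, y\neq x} J(x - y)\sigma(y) - h(x) . \tag{2.12}$$

Furthermore, the spin-flip rule is given by

$$\sigma^{x}(y) = \begin{cases} 1 - \sigma(x) & \text{if } y = x , \\ \sigma(y) & \text{if } y \neq x . \end{cases} \tag{2.13}$$
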

With the introduced notation the coarse-graining algorithm can be described as an approximation
of the microscopic dynamics, i.e., of the process $\{\sigma_t\}_{t\ge 0}$, by a coarse-grained process
$\{\eta_t\}_{t\ge 0}$, where the approximation is done in a controlled way. We are interested not only in
the approximation of the invariant measure $\mu_{N,\beta}(d\sigma)$ (see (2.6)) but also in the approximation
of the measure on the path space.
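
As an illustration of the microscopic dynamics, one event of the spin-flip process with Arrhenius rates can be sampled as follows. This is a sketch only: the text does not prescribe a sampling algorithm, and the rejection-free (Gillespie-type) event selection, the integer site indexing and the nearest-neighbour kernel used in the usage lines are assumptions made for the example.

```python
import math
import random

def arrhenius_rate(x, sigma, J, h, beta, d0=1.0):
    """c(x, sigma) = d0 if sigma(x) = 0, else d0 * exp(-beta * U(x, sigma))."""
    if sigma[x] == 0:
        return d0
    n = len(sigma)
    U = sum(J(x - y) * sigma[y] for y in range(n) if y != x) - h[x]
    return d0 * math.exp(-beta * U)

def kmc_step(sigma, J, h, beta, rng):
    """Flip one spin in place (rejection-free selection); return the elapsed time."""
    n = len(sigma)
    rates = [arrhenius_rate(x, sigma, J, h, beta) for x in range(n)]
    dt = rng.expovariate(sum(rates))                 # exponential waiting time
    x = rng.choices(range(n), weights=rates)[0]      # site chosen proportionally to its rate
    sigma[x] = 1 - sigma[x]                          # spin-flip rule
    return dt

n = 16
J = lambda k: 0.5 if min(k % n, (-k) % n) == 1 else 0.0   # hypothetical nearest-neighbour kernel
rng = random.Random(0)
sigma = [0] * n
t = 0.0
for _ in range(100):
    t += kmc_step(sigma, J, [0.0] * n, beta=1.0, rng=rng)
```

Every event requires the rates of all $N$ sites, which is the cost that the coarse-grained process is designed to reduce.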

3. Approximation of the coarse-grained process. The coarse-graining is defined in a 
geometric way by introducing the coarse-grained observables as block-spin variables. This 
approach follows the standard procedure of real-space renormalisation, see for example [17].
We remark that although we introduce block-spins our aim is not to approximate the renormalisation
group flow (either on the space of Gibbs measures or on the path space); rather
we want to find an approximation that is constructed with low computational cost and with 
controlled and computable error estimates. 

In general terms we define the coarse-graining operator $T : S_N \to S^c_{M,q}$, where the
coarse configuration space is defined on the coarse lattice $\Lambda^c_M$, and with the new state
space $\Sigma^c$, i.e., $S^c_{M,q} = (\Sigma^c)^{\Lambda^c_M}$. The coarse configuration $\eta = T\sigma \in S^c_{M,q}$ is defined on a
smaller lattice with $M$ lattice sites and with the coarse state space $\Sigma^c$ for the new lattice spins
$\eta(k)$. The parameter $q$ defines the coarse-graining ratio. The operator $T$ induces an operator
$T_*$ on the space of probability measures

$$T_* : \mathcal{P}(S_N) \to \mathcal{P}(S^c_{M,q}) , \qquad \mu \mapsto T_*\mu(\eta) := \mu\{\sigma \in S_N \mid T\sigma = \eta\} .$$

Ising-type spins. To be more specific we analyse the following case of Ising spin-flip dynamics,
$S_N = \{0, 1\}^{\Lambda_N}$. Each coarse lattice site $k \in \Lambda^c_M$ represents a cube $C_k$ that contains
$q$ sites of the microscopic lattice $\Lambda_N$. The projection operator defines the block spin at the
coarse site $k$

$$(T\sigma)(k) := \sum_{x\in C_k} \sigma(x) . \tag{3.1}$$



If the dimension $d$ of the lattice is greater than one we understand $k$ and $x$ as multi-indices
$k = (k_1, \ldots, k_d)$ and we index the corresponding lattice sites in the natural order. Choosing
the projection operator in this way defines the coarse state space as $\Sigma^c = \{0, 1, \ldots, q\}$. Given
the Markov process $(\{\sigma_t\}_{t\ge 0}, \mathcal{L})$ with the generator $\mathcal{L}$ we obtain a coarse-grained process
$\{T\sigma_t\}_{t\ge 0}$ which is not, in general, a Markov process. From the computational point of view
this may cause significant difficulties should sampling of such a process be implemented on
the computer. Therefore we derive an approximating Markov process $(\{\eta_t\}_{t\ge 0}, \mathcal{L}^c)$ which
can be easily implemented once its generator is given explicitly.
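
In one dimension the block-spin projection (3.1) amounts to summing the spins over consecutive groups of $q$ sites. A minimal sketch, assuming that the cell $C_k$ consists of the sites $kq, \ldots, (k+1)q - 1$ and that $q$ divides $N$ (the function name is hypothetical):

```python
def coarse_grain(sigma, q):
    """Block-spin map: (T sigma)(k) is the sum of sigma(x) over the k-th cell of q sites."""
    if len(sigma) % q != 0:
        raise ValueError("q must divide the number of lattice sites")
    return [sum(sigma[k * q:(k + 1) * q]) for k in range(len(sigma) // q)]

sigma = [1, 0, 1, 1, 0, 0, 0, 1]
eta = coarse_grain(sigma, q=4)   # block spins take values in {0, ..., q}
```

The map is many-to-one: any rearrangement of the spins within a cell gives the same coarse configuration, which is why the projected process $\{T\sigma_t\}_{t\ge 0}$ does not carry enough information to be Markovian in general.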

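For the model Ising system the projected generator of the coarse-grained process $\{\eta_t\}_{t\ge 0}$
can be evaluated explicitly by rearranging the summations on the lattice $\Lambda_N$. Given the
microscopic state $\sigma$ and corresponding coarse state $\eta = T\sigma$

$$\mathcal{L}\psi(T\sigma) = \sum_{k\in\Lambda^c_M} \Big[ \sum_{x\in C_k} c(x,\sigma)(1 - \sigma(x)) \Big] \big[\psi(\eta + \delta_k) - \psi(\eta)\big] + \sum_{k\in\Lambda^c_M} \Big[ \sum_{x\in C_k} c(x,\sigma)\sigma(x) \Big] \big[\psi(\eta - \delta_k) - \psi(\eta)\big] . \tag{3.2}$$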



The configuration $\delta_k$ defined on the coarse state space is equal to zero at all sites except the
site $k \in \Lambda^c_M$ where it is equal to 1, i.e., $\delta_k(j) = 1$ for $j = k$ and $= 0$ otherwise. We see from
the formula (3.2) that the exact generator for the coarse process can be written in the form

$$\mathcal{L}^c\psi(\eta) = \sum_{k\in\Lambda^c_M} c_a(k)\,\big[\psi(\eta + \delta_k) - \psi(\eta)\big] + \sum_{k\in\Lambda^c_M} c_d(k)\,\big[\psi(\eta - \delta_k) - \psi(\eta)\big] , \tag{3.3}$$

where the new rates

$$c_a(k) = \sum_{x\in C_k} c(x,\sigma)(1 - \sigma(x)) , \qquad c_d(k) = \sum_{x\in C_k} c(x,\sigma)\sigma(x) , \tag{3.4}$$


correspond to the adsorption and desorption processes. In this form the rates depend on
the microscopic configuration $\sigma$ and not on the coarse random variable $T\sigma$. Therefore, it
is reasonable to propose an approximating Markov process, which for the case of desorption/adsorption
is a birth-death process $\{\eta_t\}_{t\ge 0}$ defined on the state space $\Sigma^c = \{0, 1, \ldots, q\}$.
This process is defined by the generator $\mathcal{L}^c$ of the form (3.3) where the rates $c_a$ and $c_d$ are
replaced by approximate rates

$$\bar c_a(k, \eta) = d_0\,(q - \eta(k)) , \qquad \bar c_d(k, \eta) = d_0\,\eta(k)\, e^{-\beta \bar U(k,\eta)} . \tag{3.5}$$

For details we refer to [20]. The new rates have a simple interpretation in terms of fluctuations
on each cell: $\bar c_a(k, \eta)$ describes the rate with which the coarse variable $\eta(k)$ is increased by
one (i.e., adsorption of a single particle in the coarse cell $C_k$) and $\bar c_d(k, \eta)$ defines the rate
with which it is decreased by one (desorption in $C_k$). The new interaction potential $\bar U(\eta)$
represents the approximation of the original interaction $U(\sigma)$.
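
The birth-death structure of the approximating process can be sketched in a few lines. The example below is illustrative only: the coarse potential is passed in as a callable `U_bar(k, eta)`, the rejection-free event selection is an assumption rather than the algorithm of [20], and the non-interacting potential in the usage lines is a placeholder.

```python
import math
import random

def cg_rates(k, eta, q, U_bar, beta, d0=1.0):
    """Approximate adsorption and desorption rates (3.5) in the coarse cell k."""
    c_a = d0 * (q - eta[k])
    c_d = d0 * eta[k] * math.exp(-beta * U_bar(k, eta))
    return c_a, c_d

def cgmc_step(eta, q, U_bar, beta, rng):
    """Perform one birth-death event on the coarse lattice in place; return the elapsed time."""
    events, weights = [], []
    for k in range(len(eta)):
        c_a, c_d = cg_rates(k, eta, q, U_bar, beta)
        events += [(k, +1), (k, -1)]
        weights += [c_a, c_d]
    dt = rng.expovariate(sum(weights))
    k, change = rng.choices(events, weights=weights)[0]
    eta[k] += change
    return dt

rng = random.Random(1)
eta = [0] * 5
t = 0.0
for _ in range(200):
    t += cgmc_step(eta, 4, lambda k, e: 0.0, 1.0, rng)   # placeholder: no interactions
```

Since $\bar c_a$ vanishes at $\eta(k) = q$ and $\bar c_d$ vanishes at $\eta(k) = 0$, the coarse variables remain in $\{0, 1, \ldots, q\}$. Only $2M$ rates are evaluated per event instead of $N$.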
DEFINITION 3.1. We define the approximation $\bar U(k, \eta)$ of the potential $U(x, \sigma)$, (2.12),
at the coarse level

$$\bar U(k, \eta) = \sum_{l\in\Lambda^c_M,\, l\neq k} \bar J(k, l)\,\eta(l) + \bar J(0, 0)\,(\eta(k) - 1) - \bar h(k) . \tag{3.6}$$

The coarse-grained interaction potential $\bar J$ is computed as the average of the pair-wise interactions
between microscopic spins in the coarse cells $C_k$ and $C_l$

$$\bar J(k, l) = \frac{1}{q^2} \sum_{x\in C_k} \sum_{y\in C_l} J(x - y) , \quad \text{for all } k, l \in \Lambda^c_M \text{ such that } k \neq l , \text{ and} \tag{3.7}$$
$$\bar J(k, k) = \bar J(0, 0) = \frac{1}{q(q-1)} \sum_{x\in C_k} \sum_{y\in C_k,\, y\neq x} J(x - y) . \tag{3.8}$$
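
The averages (3.7)-(3.8) and the coarse potential (3.6) can be evaluated directly. The sketch below assumes $d = 1$, integer site indices with cell $C_k$ holding the sites $kq, \ldots, (k+1)q - 1$, and a translation-invariant kernel `J` given as a function of the index separation; the names `coarse_kernel` and `U_bar` are hypothetical, and the coarse field `h_bar` is taken as given.

```python
def coarse_kernel(J, n, q):
    """Return (J_bar, J00) for a 1D periodic lattice of n sites and cells of q sites."""
    M = n // q
    cells = [range(k * q, (k + 1) * q) for k in range(M)]
    # (3.8): average over ordered pairs of distinct sites inside one cell
    J00 = 0.0
    if q > 1:
        J00 = sum(J(x - y) for x in cells[0] for y in cells[0] if y != x) / (q * (q - 1))
    # (3.7): average over all q*q pairs between two different cells
    J_bar = [[J00 if k == l else
              sum(J(x - y) for x in cells[k] for y in cells[l]) / q**2
              for l in range(M)] for k in range(M)]
    return J_bar, J00

def U_bar(k, eta, J_bar, J00, h_bar):
    """Coarse potential (3.6) at the cell k."""
    inter = sum(J_bar[k][l] * eta[l] for l in range(len(eta)) if l != k)
    return inter + J00 * (eta[k] - 1) - h_bar[k]

# Sanity check with a constant kernel, for which the averaging loses no information
J_bar, J00 = coarse_kernel(lambda k: 1.0, n=8, q=4)
value = U_bar(0, [3, 1], J_bar, J00, h_bar=[0.0, 0.0])
```

For the constant kernel the microscopic potential (2.12) at an occupied site equals the total number of particles minus one, and the coarse potential reproduces this value exactly; for a non-constant kernel the two differ by the projection error quantified below.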

The error estimate for the projection follows directly from the assumptions on the regularity
of $J$ (or $V$), (2.4)-(2.5), and the Taylor expansion of the potential $J$. We state it as a separate
lemma.

LEMMA 3.2. Assume that $J$ satisfies (2.4)-(2.5); then the coarse-grained interaction
potential $\bar J$ at the coarse-graining level $q$ approximates the potential $J$ with the error

$$|J(x - y) - \bar J(k, l)| \le \frac{1}{L}\, c_d \sup_{x'\in C_k,\, y'\in C_l} \|\nabla V(x' - y')\| \le O\Big(\frac{q}{L^2}\Big) , \tag{3.9}$$
$$|J(x - y) - \bar J(0, 0)| \le \frac{1}{L}\, c_d \sup_{x', y'\in C_k,\, y'\neq x'} \|\nabla V(x' - y')\| \le O\Big(\frac{q}{L^2}\Big) , \tag{3.10}$$

where $c_d = \max_{k\in\Lambda^c_M} \{\mathrm{diam}(C_k)\}$.

Proof: Using the properties of the potential $V$, we expand $V$ into the Taylor series,

$$V(z) = V(z') + (z - z') \cdot \nabla V(z') + O(\|z - z'\|^2) .$$

Using the definition of $J$, (2.2), and setting $z = x - y$ and $z' = x' - y'$, where $x, x' \in C_k$ and
$y, y' \in C_l$, we have

$$J(x - y) = \frac{1}{q^2} \sum_{x'\in C_k} \sum_{y'\in C_l} J(x' - y') + \frac{1}{L q^2} \sum_{x'\in C_k} \sum_{y'\in C_l} \big((x - y) - (x' - y')\big) \cdot \nabla V(x' - y') + \frac{1}{L q^2} \sum_{x'\in C_k} \sum_{y'\in C_l} O\big(\|(x - y) - (x' - y')\|^2\big) ,$$

and using the estimate $\|(x - y) - (x' - y')\| \le \|x - x'\| + \|y - y'\|$, where each term is bounded by $\max_k\{\mathrm{diam}(C_k)\}$,
we obtain (3.9) in the case $k \neq l$ and similarly for $k = l$.

From Lemma 3.2 we derive the error bound for the approximation of the coarse-grained
potential $\bar U$. Note that in the definition of $\bar U$ the principal contribution to the summation
involves interactions within the interaction range $L$ and thus we have the following estimate.

COROLLARY 3.3. The microscopic potential $U(x, \sigma)$ is approximated by $\bar U(k, \eta)$, with
the error

$$\Delta_{q,N}(U, \bar U) := |\bar U(k, T\sigma) - U(x, \sigma)| = O\Big(\frac{q}{L}\Big) , \quad \text{for all } x \in C_k . \tag{3.11}$$



Note that this approximation represents the direct projection of the interaction kernel $J$
on the coarse space and the contributions from fine scales are neglected. This procedure differs
from the renormalisation group approach where fluctuations from the fine scales contribute
to the transformed Hamiltonian. However, in the case of finite-range interaction kernels $J$,
treated here, the above projection yields an approximation of the order $O\big((q/L)^2\big)$, as we discuss
in the next section. The coarse interaction Hamiltonian is then given explicitly in terms of $\bar J$
and $\bar h$ as

$$\bar H(\eta) = -\frac{1}{2} \sum_{l\in\Lambda^c_M} \sum_{k\neq l} \bar J(k, l)\,\eta(k)\eta(l) - \frac{\bar J(0, 0)}{2} \sum_{l\in\Lambda^c_M} \eta(l)(\eta(l) - 1) + \sum_{k\in\Lambda^c_M} \bar h(k)\eta(k) . \tag{3.12}$$

A direct calculation shows that the invariant measure of the Markov process $\{\eta_t\}_{t\ge 0}$
generated by $\mathcal{L}^c$ is again a canonical Gibbs measure

$$\mu_{M,q,\beta}(d\eta) = \frac{1}{Z_{M,q,\beta}}\, e^{-\beta \bar H(\eta)} P_{M,q}(d\eta) ,$$

where the product measure $P_{M,q}(d\eta)$ is the coarse-grained prior distribution. Note that the
prior distribution is altered by the coarse-graining procedure and different projection operators $T$
may yield prior distributions that are computationally intractable.

For example, the coarse-grained prior arising from the uniform microscopic prior ($\rho(0) =
\rho(1) = 1/2$) is the binomial distribution corresponding to $q$ independent sites:

$$P_{M,q}(d\eta) = \prod_{k\in\Lambda^c_M} \rho_q(d\eta(k)) , \qquad \rho_q(\eta(k) = p) = \frac{q!}{p!\,(q - p)!} \Big(\frac{1}{2}\Big)^q .$$

The condition of detailed balance for $\{\eta_t\}_{t\ge 0}$ with respect to the measure $\mu_{M,q,\beta}$ is

$$\bar c_a(k, \eta)\,\mu_{M,q,\beta}(\eta) = \bar c_d(k, \eta + \delta_k)\,\mu_{M,q,\beta}(\eta + \delta_k) ,$$
$$\bar c_d(k, \eta)\,\mu_{M,q,\beta}(\eta) = \bar c_a(k, \eta - \delta_k)\,\mu_{M,q,\beta}(\eta - \delta_k) .$$

We only verify the first relation, while the second identity is checked in an analogous way. Using
that $\bar H(\eta + \delta_k) - \bar H(\eta) = -\bar U(k, \eta + \delta_k)$ and the definitions of the rates (3.5), we have (assuming
without loss of generality $d_0 = 1$, and omitting the common normalisation constant):

$$\begin{aligned}
&\bar c_a(k, \eta)\,\mu_{M,q,\beta}(\eta) - \bar c_d(k, \eta + \delta_k)\,\mu_{M,q,\beta}(\eta + \delta_k) \\
&\quad = (q - \eta(k))\, e^{-\beta \bar H(\eta)} P_{M,q}(\eta) - (\eta(k) + 1)\, e^{-\beta(\bar H(\eta + \delta_k) + \bar U(k, \eta + \delta_k))} P_{M,q}(\eta + \delta_k) \\
&\quad = e^{-\beta \bar H(\eta)} \big\{ (q - \eta(k)) P_{M,q}(\eta) - (\eta(k) + 1) P_{M,q}(\eta + \delta_k) \big\} \\
&\quad = e^{-\beta \bar H(\eta)} \prod_{l\neq k} \rho_q(\eta(l)) \big\{ (q - \eta(k))\rho_q(\eta(k)) - (\eta(k) + 1)\rho_q(\eta(k) + 1) \big\} .
\end{aligned}$$

Since $(q - p)\rho_q(p) = (p + 1)\rho_q(p + 1)$ for all integers $0 \le p < q$, the last curly bracket is
equal to zero, hence the detailed balance holds. This calculation shows that due to the specific
form of the self-interaction term $\eta(l)(\eta(l) - 1)$ the detailed balance condition is satisfied for
the coarse Hamiltonian (3.12) and hence the fluctuations from microscopic dynamics are
properly included into the coarse-grained process. The coarse-graining procedure described
here satisfies basic criteria imposed on an approximating process:

(i) error control on a finite-time interval $[0, T]$. In particular, the derived coarse-grained
stochastic process $\{\eta_t\}_{t\ge 0}$ approximates a pre-specified observable on a finite-time
interval $[0, T]$, e.g., (3.1). Time-dependent error estimates such as
(5.2) can rigorously demonstrate that the process $\{\eta_t\}_{t\ge 0}$ keeps track of fluctuations
from the microscopic level. Consequently expected values of certain path dependent
(global) quantities can be properly estimated. We characterise approximation properties
of $\{T\sigma_t\}_{t\ge 0}$ by $\{\eta_t\}_{t\ge 0}$ using a suitable probability metric on the path space.

(ii) approximation of the invariant (equilibrium) measure. The invariant measure $\mu^c_{M,q,\beta}(d\eta)$
for the process $\{\eta_t\}_{t\ge 0}$ defined on $S^c_{M,q}$ is close, in a suitable probability metric, to
the projection of the microscopic measure $T_*\mu_{N,\beta}(d\eta)$. In particular the error
estimates in (5.1) below demonstrate that the coarse-grained process can preserve
the ergodicity properties of the microscopic process within a prescribed tolerance.
We also note that the coarse-graining modifies the microscopic prior $P_N(d\sigma)$ in
(2.6), yielding the coarse prior $P_{M,q}(d\eta)$.

If the approximating process follows the basic principles (i) and (ii) we observe, as a result of
the error estimates presented here and in [23], that both the transient and the long time
dynamics are expected to be captured accurately by the coarse-graining. Although this is not
a complete proof of a controlled error for infinite time, it constitutes a first rigorous step in
this direction. The approximation properties are also supported by the numerics presented
here and in the references.
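
The detailed balance of the coarse rates (3.5) with respect to the Gibbs measure built from the coarse Hamiltonian (3.12) and the binomial prior can also be confirmed numerically on a tiny coarse lattice. The following sketch enumerates all coarse configurations; the symmetric matrix `J_bar`, the value `J00` and the field `h_bar` are arbitrary test data, not quantities taken from the paper.

```python
import math
from itertools import product

def H_bar(eta, J_bar, J00, h_bar):
    """Coarse Hamiltonian (3.12)."""
    M = len(eta)
    pair = sum(J_bar[k][l] * eta[k] * eta[l]
               for k in range(M) for l in range(M) if k != l)
    self_term = sum(e * (e - 1) for e in eta)
    field = sum(h_bar[k] * eta[k] for k in range(M))
    return -0.5 * pair - 0.5 * J00 * self_term + field

def U_bar(k, eta, J_bar, J00, h_bar):
    """Coarse potential (3.6)."""
    inter = sum(J_bar[k][l] * eta[l] for l in range(len(eta)) if l != k)
    return inter + J00 * (eta[k] - 1) - h_bar[k]

def weight(eta, q, beta, J_bar, J00, h_bar):
    """Unnormalised Gibbs weight; the common factor 2^(-qM) of the prior is dropped."""
    prior = math.prod(math.comb(q, e) for e in eta)
    return math.exp(-beta * H_bar(eta, J_bar, J00, h_bar)) * prior

def max_detailed_balance_defect(q, beta, J_bar, J00, h_bar):
    """Largest relative mismatch between c_a(k, eta) mu(eta) and c_d(k, eta + delta_k) mu(eta + delta_k)."""
    M = len(h_bar)
    worst = 0.0
    for eta in product(range(q + 1), repeat=M):
        for k in range(M):
            if eta[k] == q:
                continue                      # no adsorption possible in a full cell
            up = list(eta)
            up[k] += 1
            lhs = (q - eta[k]) * weight(eta, q, beta, J_bar, J00, h_bar)
            rhs = (up[k] * math.exp(-beta * U_bar(k, up, J_bar, J00, h_bar))
                   * weight(up, q, beta, J_bar, J00, h_bar))
            worst = max(worst, abs(lhs - rhs) / max(lhs, rhs))
    return worst

J_bar = [[0.0, 0.3, 0.1], [0.3, 0.0, 0.2], [0.1, 0.2, 0.0]]
defect = max_detailed_balance_defect(q=3, beta=0.8, J_bar=J_bar, J00=0.7, h_bar=[0.2, -0.1, 0.4])
```

The defect is zero up to rounding for any symmetric `J_bar`; replacing the self-interaction term $\eta(l)(\eta(l)-1)$ by, e.g., $\eta(l)^2$ in `H_bar` would break the identity, in line with the remark above on the specific form of that term.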

4. Probability metrics and information theory tools. Since we propose the coarse-grained
process $\{\eta_t\}_{t\ge 0}$ to be only an approximation of $\{T\sigma_t\}_{t\ge 0}$ which can be computed
in a fast and simple way, it is necessary to define in what sense we evaluate the approximation
properties. We view the approximation in the coarse-graining procedure as information
loss. Such an approach is naturally connected to the actual computational implementation in
the Monte Carlo algorithm. In this section we give a brief introduction to basic tools of information
theory required in the error analysis. We define the basic notions on a probability
space with the countable state space $S$ but analogous properties and definitions hold for the
relative entropy of measures on general probability spaces (see [9]). Although the exposition
in this section is general we keep the notation consistent with the previous section. However,
the reader may assume that the state space $S$ does not necessarily refer to the space of spin
configurations.

We consider two probability measures $\pi_1(\sigma)$ and $\pi_2(\sigma)$ on the countable state space $S$,
and we define the relative entropy

$$\mathcal{R}(\pi_1 \mid \pi_2) = \sum_{\sigma\in S} \pi_1(\sigma) \log \frac{\pi_1(\sigma)}{\pi_2(\sigma)} . \tag{4.1}$$

Using Jensen's inequality it is not difficult to show that
$\mathcal{R}(\pi_1 \mid \pi_2) \ge 0$ and
$\mathcal{R}(\pi_1 \mid \pi_2) = 0$ if and only if $\pi_1(\sigma) = \pi_2(\sigma)$ for all $\sigma \in S$.

Although the above properties of the relative entropy $\mathcal{R}(\pi_1 \mid \pi_2)$ suggest that this quantity
is a distance between the measures $\pi_1$ and $\pi_2$, it does not define a true metric since it is
not symmetric, i.e., in general $\mathcal{R}(\pi_1 \mid \pi_2) \neq \mathcal{R}(\pi_2 \mid \pi_1)$. Nevertheless, there
is an important inequality that allows us to use the relative entropy as a tool for estimating
distance between two measures and hence use it for evaluating errors in the coarse-graining
procedures. Using the relative entropy we can bound the total variation of the measures $\pi_1$
and $\pi_2$:

$$\mathcal{R}(\pi_1 \mid \pi_2) \ge \frac{1}{2} \Big( \sum_{\sigma\in S} |\pi_1(\sigma) - \pi_2(\sigma)| \Big)^2 = \frac{1}{2} \|\pi_1 - \pi_2\|_{TV}^2 , \tag{4.2}$$


and hence for any observable $\phi = \phi(\sigma)$ we have the bound

$$\big| E_{\pi_1}[\phi(\sigma)] - E_{\pi_2}[\phi(\sigma)] \big| \le \sup_{\sigma} |\phi(\sigma)|\, \sqrt{2 \mathcal{R}(\pi_1 \mid \pi_2)} . \tag{4.3}$$
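
The properties of the relative entropy stated above are easy to check on a small example. The sketch below uses two arbitrary distributions on a three-point state space and an arbitrary bounded observable; all numerical values are illustrative.

```python
import math

def relative_entropy(p1, p2):
    """R(p1 | p2) = sum p1 log(p1 / p2); states with p1 = 0 contribute zero."""
    return sum(a * math.log(a / b) for a, b in zip(p1, p2) if a > 0)

def l1_distance(p1, p2):
    """Total variation norm in the convention of (4.2): sum of |p1 - p2| over the states."""
    return sum(abs(a - b) for a, b in zip(p1, p2))

pi1 = [0.5, 0.3, 0.2]
pi2 = [0.4, 0.4, 0.2]
phi = [1.0, -2.0, 0.5]          # an arbitrary bounded observable

R12 = relative_entropy(pi1, pi2)
R21 = relative_entropy(pi2, pi1)   # differs from R12: the relative entropy is not symmetric
tv = l1_distance(pi1, pi2)
gap = abs(sum(f * a for f, a in zip(phi, pi1)) - sum(f * b for f, b in zip(phi, pi2)))
bound = max(abs(f) for f in phi) * math.sqrt(2 * R12)
```

Here `R12 >= 0.5 * tv**2` is the inequality (4.2) and `gap <= bound` is the weak error bound (4.3) for the chosen observable.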

The following variational characterisation of the relative entropy is useful in the error
estimation. Given a bounded function (observable) $\phi \in L^\infty(S)$ defined on the state space $S$
we have the natural dual pairing with the measures on $S$

$$\langle \pi, \phi \rangle = \sum_{\sigma\in S} \phi(\sigma)\pi(\sigma) = E_\pi[\phi] .$$

The relative entropy (4.1) has the variational representation (see [26, pp. 338-339])

$$\mathcal{R}(\pi_1 \mid \pi_2) = \sup_{\phi\in L^\infty(S)} \big\{ \langle \pi_1, \phi \rangle - \log \langle \pi_2, e^{\phi} \rangle \big\} . \tag{4.4}$$

The variational representation is used in the next section to obtain lower bounds on the relative
entropy error of coarse-grained processes.
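
The variational representation (4.4) can be illustrated on the same kind of small example: every bounded observable yields a lower bound on the relative entropy, and on a finite state space with strictly positive measures the supremum is attained at $\phi = \log(\pi_1/\pi_2)$. The distributions and the trial observable below are arbitrary illustrative values.

```python
import math

def dual_functional(phi, p1, p2):
    """<p1, phi> - log <p2, exp(phi)>."""
    return (sum(f * a for f, a in zip(phi, p1))
            - math.log(sum(math.exp(f) * b for f, b in zip(phi, p2))))

pi1 = [0.5, 0.3, 0.2]
pi2 = [0.4, 0.4, 0.2]
R = sum(a * math.log(a / b) for a, b in zip(pi1, pi2))

optimal_phi = [math.log(a / b) for a, b in zip(pi1, pi2)]
at_optimum = dual_functional(optimal_phi, pi1, pi2)           # equals R up to rounding
lower_bound = dual_functional([1.0, -1.0, 0.0], pi1, pi2)     # a computable lower bound on R
```

This is the mechanism by which targeted observables give computable lower bounds on the information loss without evaluating the relative entropy itself.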

It is worth mentioning the relation between coarse graining, information theory and the
application of the relative entropy in the context of coarse graining. The information point of
view also clearly explains the meaning of the relative entropy as a tool that estimates the
loss of information. In information theory one is interested in encoding the random variable
$\sigma$ with values in the state space $S$, and distributed according to the probability measure
$\pi = \pi(\sigma)$, $\sigma \in S$. The information should be encoded using symbols from a $D$-ary alphabet,
for example only 0 and 1 in the case of the binary alphabet. Suppose that $C_D(\sigma)$
is a code/string corresponding to the value $\sigma \in S$. We denote by $\ell_D(\sigma)$ the length of the code
needed for the state $\sigma$. Since the information is carried in the random variable $\sigma$ we have to
ask what is the expected length of the code required to capture the states of $\sigma$ provided we
know the distribution of $\sigma$. The expected length is given by

$$E_\pi[\ell_D(\sigma)] = \sum_{\sigma\in S} \pi(\sigma)\,\ell_D(\sigma) . \tag{4.5}$$

It can be shown (see [8]) that the optimal (minimal) expected length is attained by choosing

$$\ell_D(\sigma) = \log_D \frac{1}{\pi(\sigma)} . \tag{4.6}$$

Obviously, to set the optimal length for encoding the states of the random variable $\sigma$ one
needs to know the measure $\pi$. If we assume a wrong distribution $\omega = \omega(\sigma)$ to define the
length of the code we obtain an expected length which would not be optimal. The relative
entropy $\mathcal{R}(\pi \mid \omega)$ describes the increase of the length (4.6) due to using the wrong distribution
for the random variable $\sigma$. In this sense $\mathcal{R}(\pi \mid \omega)$ is interpreted as the increase in descriptive
complexity due to "wrong information".
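
The coding interpretation can be made concrete with a three-state example. The sketch below uses the ideal (non-integer) code lengths $\log_D(1/\omega(\sigma))$ and ignores the rounding to whole symbols that an actual code would require; the two distributions are arbitrary illustrative values.

```python
import math

def expected_length(pi, omega, D=2):
    """Expected code length when lengths log_D(1/omega) are used but the states follow pi."""
    return sum(p * math.log(1.0 / w, D) for p, w in zip(pi, omega) if p > 0)

pi = [0.5, 0.25, 0.25]
omega = [0.25, 0.25, 0.5]

optimal = expected_length(pi, pi)          # lengths matched to the true distribution
mismatched = expected_length(pi, omega)    # lengths designed for the wrong distribution
penalty = sum(p * math.log(p / w, 2) for p, w in zip(pi, omega))   # relative entropy in base 2
```

The difference `mismatched - optimal` coincides with `penalty`, i.e., with the relative entropy $\mathcal{R}(\pi \mid \omega)$ measured in units of $\log D$.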
This information point of view is applicable to the analysis of coarse-graining procedures:
the spin configurations $\sigma$ are sampled by the Markov chain Monte Carlo algorithms
and hence samples of a random variable $\sigma$ with large-dimensional state space are generated.
On the coarse level we sample an approximate process $\{\eta_t\}_{t\ge 0}$ instead of the exact projection
$\{T\sigma_t\}_{t\ge 0}$, thus assuming a wrong measure/distribution for the random variable $\sigma$.
Using the relative entropy for evaluating the approximation properties we estimate the loss
of information arising from using samples of $\{\eta_t\}_{t\ge 0}$ instead of the exact coarse-grained
process.

5. Error analysis and a priori estimates for coarse-grained processes. As described
in the previous section we construct a new process which only approximates the projected
process $\{T\sigma_t\}_{t \ge 0}$. The approximation properties of this construction are quantified in this
section.

We do not attempt to capture the effect of fine scales exactly and incorporate them into
the coarse model through the renormalisation group transformation. Instead we construct an
approximate process $\{\eta_t\}_{t \ge 0}$ with the invariant measure $\mu^c_{M,q,\beta}$. The first question which
needs to be addressed is the comparison of, and an error estimate for, the exactly coarse-grained
equilibrium measure, i.e., $T_*\mu_{N,\beta}$, and its approximation $\mu^c_{M,q,\beta}$. We recall that $T_*$ is the
projection operator induced by the fine-to-coarse projection of spin variables.

5.1. Information theory estimates. The principal idea proposed in [24] is to control the
specific loss of information quantified by the relative entropy $\mathcal{R}(\mu^c_{M,q,\beta} \mid T_*\mu_{N,\beta})$ between
the coarse-grained equilibrium measure $\mu^c_{M,q,\beta}$ and the projected equilibrium measure $T_*\mu_{N,\beta}$
of the microscopic process.

PROPOSITION 5.1 ([24], A priori estimate).



This a priori estimate quantifies the dependence of the information distance, the specific
relative entropy $\mathcal{R}(\mu \mid \nu)$, on the coarse-graining ratio $q$ and the interaction range $L$.

The procedure described in the previous section defines a hierarchy of coarse-grained
algorithms parametrised by $q$. The fully resolved simulations correspond to the microscopic
model $q = 1$ while the mean-field approximation is obtained in the case where $q > L$, i.e.,
when we coarse-grain beyond the interaction range of the potential. Each level of this
hierarchy introduces an error since some fine-scale fluctuations are neglected.

For the comparison of the processes $\{T\sigma_t\}_{t \ge 0}$ and $\{\eta_t\}_{t \ge 0}$ we need to carry out a similar
a priori analysis on the coarse path space $\mathcal{D}(\mathcal{S}^c_{M,q})$, i.e., on the space of all right-continuous
paths $\eta_t : [0, \infty) \to \mathcal{S}^c_{M,q}$. Above we have presented estimates for the exact coarse graining
$T_*\mu_{N,\beta}$ of the invariant measure $\mu_{N,\beta}$ and its approximation $\mu^c_{M,q,\beta}$ computed in terms of
the coarse Hamiltonian. In a similar way we treat the measures on the path space: we denote
by $Q_{\sigma_0,[0,T]}$ the measure on $\mathcal{D}(\mathcal{S}_N)$ for the process $\{\sigma_t\}_{t \in [0,T]}$ on the interval $[0, T]$ with
the initial distribution $\sigma_0$. Similarly $Q^c_{\eta_0,[0,T]}$ denotes the measure on the coarse path space
$\mathcal{D}(\mathcal{S}^c_{M,q})$. With a slight abuse of notation we also use $T_*Q$ to denote the projection of the
measure $Q$ on the coarse path space, i.e., the exact coarsening of the measure $Q$. The fully
rigorous analysis on the path space is more involved and we refer to [23]. For the sake of
completeness we only state the main a priori estimate.
PROPOSITION 5.2 ([23]). Suppose the process $\{\eta_t\}_{t \in [0,T]}$, given by the coarse generator
$\mathcal{L}^c$, defines the coarse approximation of the microscopic process $\{\sigma_t\}_{t \in [0,T]}$. Then for any
$q < L$ and $N$, $Mq = N$, the information loss as $q/L \to 0$ is

$$ \frac{1}{N}\,\mathcal{R}\big(T_*Q_{\sigma_0,[0,T]} \mid Q^c_{\eta_0,[0,T]}\big) = T\,O\!\left(\frac{q}{L}\right) . \tag{5.2} $$

Remark: The detailed proof of this information estimate (see [23]) reveals that no control of
fluctuations of the process $\{\sigma_t\}_{t \ge 0}$ is necessary for the estimate. Consequently the estimate
is very robust and, as long as $q/L$ is small, the approximation by the coarse-graining scheme
yields a small error independent of the potential $V$ or the initial distribution $\sigma_0$. Although
the estimate is for finite times $[0, T]$ only, and grows with $T$, in many cases the system
nucleates a new phase at the initial stage of its evolution and thus the estimate ensures good
approximation of the nucleation phase.

It is worth noticing that the relative entropy estimate clearly demonstrates the limitations of
the coarse-graining method since it gives an error of order one for short-range interactions
(the nearest neighbour interaction corresponds to $L = 1$). On the other hand the analysis using
the relative entropy (information) distance identifies the small parameter in the asymptotic
expansion of the blocking error, namely the ratio $q/L$.

The next estimate provides a lower bound for the loss of information in terms of coarser 
observables: 

PROPOSITION 5.3 (Lower bound). Suppose the process $(\{\eta_t\}_{t \in [0,T]}, \mathcal{L}^c)$, defined by the
coarse-graining operator $T$ with coarse-graining parameters $Mq = N$, is the coarse
approximation of the microscopic process $\{\sigma_t\}_{t \in [0,T]}$. Let $T^{M',q'}$ be another coarse-graining
operator such that $M' < M$, $M'q' = Mq = N$. Then the following estimate for the
invariant microscopic measure $\mu_{N,\beta}$ and the coarse approximation $\mu^c_{M,q,\beta}$ holds

$$ \mathcal{R}\big(\mu^c_{M,q,\beta} \mid T_*\mu_{N,\beta}\big) \ge \mathcal{R}\big(T^{M',q'}_*\mu^c_{M,q,\beta} \mid T^{M',q'}_*\mu_{N,\beta}\big) . \tag{5.3} $$

Moreover, on any finite-time interval $[0, T]$

$$ \mathcal{R}\big(T_*Q_{\sigma_0,[0,T]} \mid Q^c_{\eta_0,[0,T]}\big) \ge \mathcal{R}\big(T^{M',q'}_*Q_{\sigma_0,[0,T]} \mid T^{M',q'}_*Q^c_{\eta_0,[0,T]}\big) . \tag{5.4} $$

PROOF: We first recall the variational formulation for the relative entropy

$$ \mathcal{R}(\mu \mid \nu) = \sup_f \left\{ \int f \, d\mu - \log \int e^{f} \, d\nu \right\} , \tag{5.5} $$

where the supremum is over all bounded functions on the space where the measures are
defined. This formula now readily implies the result since

$$ \mathcal{R}(\mu \mid \nu) \ge \sup_f \left\{ \int f \circ T \, d\mu - \log \int e^{f \circ T} \, d\nu \right\} = \mathcal{R}(T_*\mu \mid T_*\nu) , \tag{5.6} $$

where $T$ is the projection operator (superscripts omitted) in the statement of the proposition.

Remark: This estimate provides a lower bound for the loss of information in terms of
coarser observables, hence the condition $M' < M$ where $M'q' = Mq = N$. For instance,
if $M' = 1$, $q' = N$ the measures $T^{M',q'}_*\mu^c_{M,q,\beta}$ and $T^{M',q'}_*\mu_{N,\beta}$ are the PDFs of the total
coverage with respect to the coarse-grained (essentially mean field with a noise) and the
microscopic Gibbs states respectively. We characterise such an estimate as a priori since the
bound depends on the exact microscopic process, in analogy to bounds for approximations to
PDEs which depend on a Sobolev norm $\|\cdot\|$ of the exact solution. At first glance it may
appear that such an estimate is hard to implement since it depends on the exact microscopic
MC. However, for relatively small systems where microscopic MC can be carried out, the
bound (5.3) can provide a lower bound on the loss of information, as well as a sense of how
sharp the upper bounds given by a posteriori estimates are. More specifically, when $M'$ is
small, i.e., $M' = 1, 2, 3, \ldots$, the PDFs can be calculated as histograms by MC and
subsequently the relative entropy in the lower bound is straightforward to compute.
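The histogram computation mentioned in the remark can be sketched as follows. This is a minimal illustration in Python with NumPy, assuming that samples of an integer-valued coarse observable (for example the total number of occupied sites, i.e. $M' = 1$) are already available from a coarse-grained and from a microscopic simulation; the function name and the plug-in estimator are assumptions, not the procedure used in the paper.

```python
import numpy as np

def histogram_relative_entropy(samples_p, samples_q, n_states):
    """Plug-in estimate of R(p | q) from integer samples in {0, ..., n_states - 1}.

    samples_p: samples from the first measure (e.g. the coarse-grained observable)
    samples_q: samples from the reference measure (e.g. the projected microscopic one)
    """
    p = np.bincount(samples_p, minlength=n_states).astype(float)
    q = np.bincount(samples_q, minlength=n_states).astype(float)
    p /= p.sum()
    q /= q.sum()
    support = p > 0
    # The relative entropy is infinite if p charges a state that q never visits.
    if np.any(q[support] == 0):
        return float("inf")
    return float(np.sum(p[support] * np.log(p[support] / q[support])))

# Tiny hand-made data set (hypothetical, for illustration only).
coarse_samples = np.array([0, 0, 1, 1])
micro_samples = np.array([0, 1, 1, 1])
estimate = histogram_relative_entropy(coarse_samples, micro_samples, 2)
```

With finitely many samples the plug-in estimate is biased and returns infinity whenever a bin is empty in the reference histogram, so in practice the number of realisations has to be large compared with the number of states of the coarse observable.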

5.2. Microscopic reconstruction and weak convergence estimates. In many practical
MC simulations the main goal is to estimate averages (expected values) of specific
observables. Therefore it is natural to analyze the weak approximation properties of the
coarse-graining procedure. The weak error is defined as the quantity
$e_w = |\mathbb{E}_S[\psi(T\sigma_t)] - \mathbb{E}_S[\psi(\eta_t)]|$, where the expectation $\mathbb{E}_S[\cdot]$ is defined for the path
conditioned on the initial configuration $\eta_0 = T\sigma_0 = S$. Alternatively we can compare the microscopic
process $\{\sigma_t\}_{t \ge 0}$ with its synthetic process $\{\gamma_t\}_{t \ge 0}$ which is reconstructed from the coarse
process $\{\eta_t\}_{t \ge 0}$. The weak error is then defined as $e_w = |\mathbb{E}_S[\phi(\sigma_t)] - \mathbb{E}_S[\phi(\gamma_t)]|$, where the
expectation $\mathbb{E}_S[\cdot]$ is now defined for the path conditioned on the initial configuration $\sigma_0 = S$. Here and in what
follows $\phi$ denotes a test function (observable) on the fine level while $\psi$ is used for a test function
on the coarse level. Theorem 5.8 and Corollary 5.9 quantify the rate of convergence for
the weak error on both levels as $q/L \to 0$. We refer to [22] for error estimates in the weak
topology between microscopic MC algorithms and the SDE approximations derived therein.

Before we formulate the proposition and proceed with the proof it is worth clarifying
the difficulty of comparing the projected process $\{T\sigma_t\}_{t \ge 0}$ with the approximating process
$\{\eta_t\}_{t \ge 0}$. The projection $T\sigma_t$ of the microscopic process on the coarse grid does not
necessarily define a Markov process. On the other hand the approximating process $\{\eta_t\}_{t \ge 0}$ is
constructed as a Markov process $(\{\eta_t\}_{t \ge 0}, \mathcal{L}^c)$ with the generator $\mathcal{L}^c$ defined by (3.5). To
circumvent this technical difficulty the authors of [23] suggested constructing an auxiliary
process $\{\gamma_t\}_{t \ge 0}$ as an intermediate step in the estimation of the relative entropy between the
processes $\{\sigma_t\}_{t \ge 0}$ and $\{\eta_t\}_{t \ge 0}$. We adopt the same strategy in order to compare
observables which depend on the Markovian processes $\{\sigma_t\}_{t \ge 0}$ and $\{\gamma_t\}_{t \ge 0}$. The process
$\{\gamma_t\}_{t \ge 0}$ can be directly reconstructed from the coarse-grained process $\{\eta_t\}_{t \ge 0}$. Thus we are
led to the definition of the synthetic microscopic (Markov) process $\{\gamma_t\}_{t \ge 0}$ associated with
the process $\{\sigma_t\}_{t \ge 0}$.

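DEFINITION 5.4 (Synthetic microscopic process). The auxiliary process $\{\gamma_t\}_{t \ge 0}$ is
defined on the microscopic configuration space $\mathcal{S}_N$ by the generator $\mathcal{L}^\gamma : L^\infty(\mathcal{S}_N) \to \mathbb{R}$,

$$ \mathcal{L}^\gamma f(\sigma) = \sum_{x \in \Lambda_N} c^\gamma(x, \sigma)\,\big(f(\sigma^x) - f(\sigma)\big) , \tag{5.7} $$

where the rate function $c^\gamma(x, \sigma)$ is defined in terms of the coarse-grained interaction potential $\bar{U}$,

$$ c^\gamma(x, \sigma) = d_0\,(1 - \sigma(x)) + d_0\,\sigma(x)\,e^{-\beta \bar{U}(k(x),\, T\sigma)} , $$

and constant interpolation is used to extend the function $\bar{U}(\cdot, \cdot)$ from the coarse lattice to the
fine lattice. We denote by $k(x)$ the index of the cell to which the site $x$ belongs, i.e.,
$x \in C_{k(x)}$.

The properties of $\{\gamma_t\}_{t \ge 0}$ were studied in [23] and it was proved that:
(i) the coarse-grained projection $\{T\gamma_t\}_{t \ge 0}$ of the Markov process $(\{\gamma_t\}_{t \ge 0}, \mathcal{L}^\gamma)$ is still a
Markov process;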
(ii) the processes $\{T\gamma_t\}_{t \ge 0}$ and $\{\eta_t\}_{t \ge 0}$ have the same transition rates. Hence, whenever the
processes have the same initial distribution they induce the same probability measure
on the coarse-grained path space $\mathcal{D}(\mathcal{S}^c_{M,q})$. If we define $Q^c_{\eta_0}(\eta, t)$ and $Q_{\gamma_0}(\gamma, t)$ to
be the probability measures of the Markov processes $\{\eta_t\}_{t \ge 0}$ and $\{\gamma_t\}_{t \ge 0}$ respectively
(conditioned on the initial condition $\eta_0 = T\gamma_0$), then for all $t > 0$ we have
the projection

$$ Q^c_{\eta_0}(\eta_t, t) = T_*Q_{\gamma_0}(\gamma, t) = \sum_{\{\gamma \,\mid\, T\gamma = \eta_t\}} Q_{\gamma_0}(\gamma, t) , $$

provided this relation is satisfied at $t = 0$. Hence this property allows us to compare
the processes in a path-wise way.

(iii) the microscopic process $\{\gamma_t\}_{t \ge 0}$ can be reconstructed from the approximating coarse
process $\{\eta_t\}_{t \ge 0}$. Such a reconstruction is an inverse procedure to the projection from
the fine to the coarse configuration space. In this way we can compare the original microscopic
process with the approximation on the coarse configuration space. A simple
choice of a reconstruction operator is to distribute the spins $\gamma_t(x)$ for $x \in C_k$ uniformly
so that $T\gamma_t|_{C_k} = \eta_t(k)$.
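The projection and the simple reconstruction operator of item (iii) can be sketched as follows. This is a minimal illustration in Python with NumPy, assuming a one-dimensional lattice whose cells $C_k$ consist of $q$ consecutive sites and spins with values in {0, 1}; the function names `project` and `reconstruct` are hypothetical and not taken from the paper.

```python
import numpy as np

def project(sigma, q):
    """Coarse variable eta(k) = number of occupied sites in cell C_k.

    Assumes len(sigma) is a multiple of q and cells are blocks of q consecutive sites.
    """
    return sigma.reshape(-1, q).sum(axis=1)

def reconstruct(eta, q, rng):
    """Fine configuration gamma with eta(k) occupied sites placed uniformly in cell C_k."""
    gamma = np.zeros((len(eta), q), dtype=int)
    for k, n_occupied in enumerate(eta):
        # A random permutation of the cell sites; the first eta(k) of them are occupied.
        occupied_sites = rng.permutation(q)[:n_occupied]
        gamma[k, occupied_sites] = 1
    return gamma.ravel()

# Illustrative usage with hypothetical values: M = 4 cells of q = 5 sites each.
rng = np.random.default_rng(0)
eta = np.array([0, 3, 5, 1])
gamma = reconstruct(eta, 5, rng)
```

By construction `project(gamma, 5)` returns `eta` again, which is the constraint $T\gamma|_{C_k} = \eta(k)$; the reconstruction is one-to-many, so different random seeds give different fine configurations with the same projection.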
Remark: It is conceivable that the synthetic process $\{\gamma_t\}_{t \ge 0}$ can be used not only as a
technical tool but also as a systematic procedure for reconstructing the microscopic process $\{\sigma_t\}_{t \ge 0}$
for the purpose of model refinement or adaptivity since, as shown in Theorem 5.8, the
reconstruction is done under rigorous error estimates. In the estimates derived below we deal
with a specific class of test functions $\phi \in L^\infty(\mathcal{S}_N)$ which depend only on the coarse variable
$\eta = T\sigma$; in other words we impose the assumption

$$ \text{(A1)} \qquad \phi(\sigma) = \psi(T\sigma) , \quad \text{where } \psi \in L^\infty(\mathcal{S}^c_{M,q}) , \text{ and} \tag{5.8} $$
$$ |\partial_x \phi(\sigma)| \le C , \quad \text{where } C \text{ is a constant independent of } N . \tag{5.9} $$

Remark: Observables used in the numerical simulations, such as the total coverage,
satisfy this assumption.

The principal tool for analysing the weak error is its representation in terms of solutions
to the final value problem on $\mathcal{S}_N$

$$ \partial_t v(t, \sigma) + \mathcal{L} v(t, \sigma) = 0 , \quad v(T, \cdot) = \phi(\cdot) , \quad \text{for } t < T , $$

where $\mathcal{L}$ is a generator of the Markov semigroup that defines the lattice dynamics. Before we
state the main estimate of the weak error and its proof we need several preliminary lemmata
that characterize properties of the semigroup generated by the operator $\mathcal{L}$ defined by (2.1).
The specific calculations are better presented by introducing an alternative notation for the
generator $\mathcal{L}$. We define an operator of discrete differentiation for functions $f \in L^\infty(\mathcal{S}_N)$

$$ \partial_x f(\sigma) = f(\sigma^x) - f(\sigma) , \quad \text{for all } x \in \Lambda_N , \tag{5.10} $$

and we introduce two vectors indexed by the lattice sites $x \in \Lambda_N$

$$ \nabla_\sigma f(\sigma) = (\partial_x f(\sigma))_{x \in \Lambda_N} , \qquad c(\sigma) = (c(x, \sigma))_{x \in \Lambda_N} . $$

The scalar product is defined in the natural way as $c(\sigma) \cdot \nabla_\sigma f(\sigma) = \sum_{x \in \Lambda_N} c(x, \sigma)\,\partial_x f(\sigma)$.
Using this notation we write

$$ \mathcal{L} f(\sigma) = c(\sigma) \cdot \nabla_\sigma f(\sigma) , \quad \text{for all } \sigma \in \mathcal{S}_N . \tag{5.11} $$
The space of functions defined on the configuration space $\mathcal{S}_N$ is equipped with the strong $L^\infty$
topology given by the norm $\|f\|_\infty = \sup_\sigma |f(\sigma)|$.
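The notation (5.10) and (5.11) translates directly into code. The following is a minimal sketch in Python with NumPy, assuming spins with values in {0, 1} stored in an integer array; the observable `coverage` and the rate function `adsorption_only` are hypothetical placeholders chosen only to make the example self-contained, not the rates (2.1).

```python
import numpy as np

def flip(sigma, x):
    """Configuration sigma^x: the spin at site x is flipped, all others are unchanged."""
    flipped = sigma.copy()
    flipped[x] = 1 - flipped[x]
    return flipped

def discrete_gradient(f, sigma):
    """Vector (d_x f(sigma))_x with d_x f(sigma) = f(sigma^x) - f(sigma), cf. (5.10)."""
    return np.array([f(flip(sigma, x)) - f(sigma) for x in range(len(sigma))])

def generator(f, rates, sigma):
    """L f(sigma) = c(sigma) . grad f(sigma), cf. (5.11); rates(sigma) returns c(., sigma)."""
    return float(np.dot(rates(sigma), discrete_gradient(f, sigma)))

def coverage(sigma):
    """Observable: fraction of occupied sites."""
    return sigma.mean()

def adsorption_only(sigma):
    """Placeholder rates (an assumption): unit-rate adsorption at empty sites only."""
    return 1.0 - sigma

sigma = np.array([1, 0, 0, 1])
grad = discrete_gradient(coverage, sigma)
drift = generator(coverage, adsorption_only, sigma)
```

For the coverage each entry of `grad` equals plus or minus $1/N$, so the sum of the absolute values of the discrete derivatives stays bounded independently of the lattice size.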

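To prove the estimate in Theorem 5.8 we need an estimate for the difference operator $\nabla_\sigma$,
stated here as a separate lemma.

LEMMA 5.5. Let $v(t, \sigma)$ be the solution of

$$ \partial_t v + \mathcal{L} v = 0 , \quad v(T, \sigma) = \phi(\sigma) , \quad \text{for } t < T , \tag{5.12} $$

on a given interval $t < T$. Then

$$ \sum_{x \in \Lambda_N} \|\partial_x v(t, \cdot)\|_\infty \le C_T \sum_{x \in \Lambda_N} \|\partial_x \phi\|_\infty . \tag{5.13} $$

Moreover, the constant $C_T$ depends exponentially on the final time $T$.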

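Proof: Using the notation introduced above and the definition of $\mathcal{L}$ we recast the evolution
equation (5.12) into the familiar form of a transport equation on the configuration space

$$ \partial_t v + c(\sigma) \cdot \nabla_\sigma v = 0 , \quad \sigma \in \mathcal{S}_N , \ t > 0 . \tag{5.14} $$

Subtracting (5.14) for $v(t, \sigma^x)$ and $v(t, \sigma)$ we have

$$ \partial_t\big(v(t, \sigma^x) - v(t, \sigma)\big) + c(\sigma) \cdot \big(\nabla_\sigma v(t, \sigma^x) - \nabla_\sigma v(t, \sigma)\big) + \big(c(\sigma^x) - c(\sigma)\big) \cdot \nabla_\sigma v(t, \sigma^x) = 0 , $$

which we write as

$$ \partial_t(\partial_x v(t, \sigma)) + c(\sigma) \cdot \nabla_\sigma(\partial_x v(t, \sigma)) + \partial_x c(\sigma) \cdot \nabla_\sigma v(t, \sigma^x) = 0 . \tag{5.15} $$

Next we derive $L^\infty$-bounds for the discrete derivatives $\partial_x c(\sigma)$ using the explicit definition of
the rates $c(x, \sigma)$ in (2.1). For each component, indexed by $z \in \Lambda_N$, of the vector $c(\sigma)$ we
have

$$ \partial_x c(z, \sigma) = c(z, \sigma^x) - c(z, \sigma) = (1 - \sigma^x(z)) + \sigma^x(z)\,e^{-U(z, \sigma^x)} - \big[(1 - \sigma(z)) + \sigma(z)\,e^{-U(z, \sigma)}\big] . $$

For the spin-flip dynamics, i.e., $\sigma^x(y) = 1 - \sigma(y)$ if $x = y$ and $\sigma^x(y) = \sigma(y)$ otherwise, a
straightforward calculation gives $\partial_x U(z, \sigma) = U(z, \sigma^x) - U(z, \sigma) = J(z - x)(1 - 2\sigma(x))$
if $z \ne x$ and it is equal to zero otherwise. Thus the discrete derivative $\partial_x c(\sigma)$ is

$$ \partial_x c(z, \sigma) = \begin{cases} (2\sigma(x) - 1)\,\big(1 - e^{-U(x, \sigma)}\big) , & \text{for } z = x , \\ \sigma(z)\,e^{-U(z, \sigma)}\,\big(e^{-J(z - x)(1 - 2\sigma(x))} - 1\big) , & \text{for } z \ne x . \end{cases} $$

Recalling the definition (2.3) of the interaction potential $J$ we have that $J(z - x) \sim 1/L$ for
$|z - x| \le L$ and $J = 0$ otherwise. Hence we have derived $L^\infty$-bounds for the discrete derivative
of the rates

$$ \partial_x c(z, \sigma) \sim \begin{cases} O(1) , & \text{for } z = x , \\ O(1/L) , & \text{for } |z - x| \le L , \\ 0 , & \text{otherwise.} \end{cases} \tag{5.16} $$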

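Going back to the equation (5.15) we have for all $x \in \Lambda_N$

$$ \partial_t(\partial_x v(t, \sigma)) + \mathcal{L}\,\partial_x v(t, \sigma) + \sum_{z \in \Lambda_N} \partial_x c(z, \sigma)\,\partial_z v(t, \sigma^x) = 0 . \tag{5.17} $$

The estimates in (5.16) imply

$$ \partial_t \partial_x v(t, \sigma) + \mathcal{L}\,\partial_x v(t, \sigma) + O(1)\,\partial_x v(t, \sigma^x) + O(1/L) \sum_{|z - x| \le L} \partial_z v(t, \sigma^x) = 0 , \tag{5.18} $$

and we have for all $\sigma \in \mathcal{S}_N$ the solution formula

$$ \partial_x v(t, \sigma) = e^{(T - t)\mathcal{L}}[\partial_x v(T, \sigma)] + \int_t^T e^{(s - t)\mathcal{L}} \Big[ O(1)\,\partial_x v(s, \sigma^x) + O(1/L) \sum_{|z - x| \le L} \partial_z v(s, \sigma^x) \Big] ds . $$

By the contractive property of the semigroup $e^{t\mathcal{L}}$ we have the estimate

$$ \|\partial_x v(t, \cdot)\|_\infty \le \|\partial_x v(T, \cdot)\|_\infty + \int_t^T O(1)\,\|\partial_x v(s, \cdot)\|_\infty \, ds + \int_t^T O(1/L) \sum_{|z - x| \le L} \|\partial_z v(s, \cdot)\|_\infty \, ds , $$

for all $x \in \Lambda_N$. Thus summing over all $x \in \Lambda_N$ we obtain

$$ \sum_{x \in \Lambda_N} \|\partial_x v(t, \cdot)\|_\infty \le \sum_{x \in \Lambda_N} \|\partial_x v(T, \cdot)\|_\infty + \int_t^T \Big( O(1) \sum_{x \in \Lambda_N} \|\partial_x v(s, \cdot)\|_\infty + O(1/L) \sum_{x \in \Lambda_N} \sum_{|z - x| \le L} \|\partial_z v(s, \cdot)\|_\infty \Big) ds , $$

where the last double sum in the integrand is bounded by $2L \sum_x \|\partial_x v(s, \cdot)\|_\infty$. Hence by
setting $\theta(t) = \sum_x \|\partial_x v(t, \cdot)\|_\infty$ we have

$$ \theta(t) \le \theta(T) + \int_t^T O(1)\,\theta(s) \, ds , $$

from which, by using Gronwall's inequality, we obtain the bound

$$ \theta(t) \le e^{C(T - t)}\,\theta(T) , $$

which concludes the proof of Lemma 5.5.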
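The scaling of the discrete derivatives of the rates used in the proof can be checked numerically. The sketch below, in Python with NumPy, compares the difference $c(z, \sigma^x) - c(z, \sigma)$ computed by brute force with its closed form (the factor $(2\sigma(x) - 1)(1 - e^{-U(x,\sigma)})$ at $z = x$ and $\sigma(z) e^{-U(z,\sigma)}(e^{-J(z-x)(1 - 2\sigma(x))} - 1)$ at $z \ne x$). It assumes a periodic one-dimensional lattice with $N > 2L$, a uniform potential $J(r) = J_0/(2L)$ for $0 < |r| \le L$, rates $c(z, \sigma) = (1 - \sigma(z)) + \sigma(z) e^{-U(z, \sigma)}$ with $\beta$ absorbed into $U$, and zero external field by default; these normalisations are assumptions made for illustration.

```python
import numpy as np

def potential(sigma, L, J0=1.0, h=0.0):
    """U(z, sigma) = sum_{0 < |z - y| <= L} J0 / (2L) * sigma(y) - h on a periodic lattice."""
    N = len(sigma)
    U = np.empty(N)
    for z in range(N):
        neighbours = [(z + r) % N for r in range(-L, L + 1) if r != 0]
        U[z] = J0 / (2 * L) * sigma[neighbours].sum() - h
    return U

def rates(sigma, L):
    """Spin-flip rates c(z, sigma) = (1 - sigma(z)) + sigma(z) * exp(-U(z, sigma))."""
    return (1 - sigma) + sigma * np.exp(-potential(sigma, L))

def d_rates(sigma, x, L):
    """Brute-force discrete derivative c(., sigma^x) - c(., sigma)."""
    flipped = sigma.copy()
    flipped[x] = 1 - flipped[x]
    return rates(flipped, L) - rates(sigma, L)

def d_rates_closed_form(sigma, x, L, J0=1.0):
    """Closed form of the same difference for the uniform potential above."""
    N = len(sigma)
    U = potential(sigma, L, J0)
    J = J0 / (2 * L)
    out = np.zeros(N)
    out[x] = (2 * sigma[x] - 1) * (1 - np.exp(-U[x]))
    for r in range(-L, L + 1):
        if r != 0:
            z = (x + r) % N
            out[z] = sigma[z] * np.exp(-U[z]) * (np.exp(-J * (1 - 2 * sigma[x])) - 1)
    return out

rng = np.random.default_rng(1)
sigma = rng.integers(0, 2, size=40)
numeric = d_rates(sigma, 7, 5)
closed = d_rates_closed_form(sigma, 7, 5)
```

In this setting the off-site entries are bounded in absolute value by $e^{J_0/(2L)} - 1$, which is of order $1/L$, while the entry at $z = x$ is of order one, in line with (5.16).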

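Next we establish an $L^\infty$-bound for discrete derivatives of solutions generated by the
semigroups $e^{t\mathcal{L}}$ and $e^{t\mathcal{L}^\gamma}$.

LEMMA 5.6. Let $u(t, \sigma)$ be the solution of

$$ \partial_t u + \mathcal{L} u = 0 , \quad u(T, \cdot) = \phi , \quad \text{for } t < T , $$

and let $v(t, \sigma)$ solve

$$ \partial_t v + \mathcal{L}^\gamma v = 0 , \quad v(T, \cdot) = \psi , \quad \text{for } t < T . $$

Then for any $t < T$ the following estimate holds

$$ \sum_{x \in \Lambda_N} \|\partial_x u(t, \cdot) - \partial_x v(t, \cdot)\|_\infty \le C_1(T) \sum_{x \in \Lambda_N} \|\partial_x \phi - \partial_x \psi\|_\infty + C_2(T) \left(\frac{q}{L}\right) . \tag{5.19} $$

The constants $C_1$ and $C_2$ are independent of $q$ and $L$ but depend exponentially on the final
time $T$.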

Proof: We use the same approach and notation as in the proof of Lemma 5.5. Subtracting
the evolution equations and defining $w_x(t, \sigma) = \partial_x u(t, \sigma) - \partial_x v(t, \sigma)$, $w(t, \sigma) =
(w_x(t, \sigma))_{x \in \Lambda_N}$ we have

$$ \partial_t w_x(t, \sigma) + \mathcal{L} w_x(t, \sigma) \ + \tag{5.20} $$
$$ (c^\gamma(\sigma) - c(\sigma)) \cdot \nabla_\sigma v(t, \sigma^x) \ + \tag{5.21} $$
$$ \partial_x c(\sigma) \cdot w(t, \sigma^x) \ + \tag{5.22} $$
$$ (\partial_x c(\sigma) - \partial_x c^\gamma(\sigma)) \cdot \nabla_\sigma v(t, \sigma^x) = 0 . \tag{5.23} $$

From Lemma 5.5 we have estimates for the terms involving $\nabla_\sigma v(t, \cdot)$ (notice that the lemma
essentially gives the estimate of $\|\nabla_\sigma v(t, \cdot)\|_\infty$). Furthermore, from the definition of the rates
$c(x, \sigma)$ and $c^\gamma(x, \sigma)$ a direct calculation (similar to that used in the proof of Lemma 5.5) yields
the estimate

$$ \|c - c^\gamma\|_\infty = O\!\left(\frac{q}{L}\right) , \tag{5.24} $$

which allows us to control (5.21) and (5.23). The term (5.22) is treated in the same way as the
similar term in the proof of Lemma 5.5. Hence, for all $x \in \Lambda_N$ we obtain

$$ \partial_t w_x(t, \sigma) + \mathcal{L} w_x(t, \sigma) + O(1/L) \sum_{|z - x| \le L} w_z(t, \sigma^x) \le O(q/L)\,\|\partial_x v(t, \cdot)\|_\infty . $$

As in the proof of Lemma 5.5 we complete the proof by summing over $x \in \Lambda_N$ and
applying Gronwall's inequality.

Since we are comparing the process $\{\sigma_t\}_{t \ge 0}$ with the process $\{\gamma_t\}_{t \ge 0}$, which is defined
only up to the equivalence given by the projection operator $T$, we have to establish uniqueness
of solutions for final data satisfying the assumption (A1).

LEMMA 5.7. Let $\phi \in L^\infty(\mathcal{S}_N)$, $\psi \in L^\infty(\mathcal{S}^c_{M,q})$ be test functions satisfying (A1).
Assume that $v(t, \gamma)$ is the solution of the final value problem

$$ \partial_t v + \mathcal{L}^\gamma v = 0 , \quad v(T, \gamma) = \phi(\gamma) = \psi(T\gamma) . \tag{5.25} $$

Then for all $\gamma, \gamma' \in \mathcal{S}_N$ such that $T\gamma = T\gamma'$

$$ v(t, \gamma) = v(t, \gamma') , \quad \text{for all } t < T . \tag{5.26} $$



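Proof: For convenience we write $v(t, \gamma) = v(t, T\gamma)$. Given a configuration $\gamma \in \mathcal{S}_N$ we
can reconstruct an arbitrary configuration $\gamma' \in \mathcal{S}_N$ such that $T\gamma' = T\gamma$ by considering a
permutation $\pi : \Lambda_N \to \Lambda_N$, $\pi = (\pi_1, \ldots, \pi_M)$ such that

$$ \pi_k : C_k \to C_k , \quad k = 1, \ldots, M . $$

The action of $\pi$ on the configuration space is defined in a natural way, $\gamma' = \gamma \circ \pi$, or
equivalently $\gamma'(x) = \gamma(\pi x)$. Since the permutation does not change the total spin in the cell we have
$T(\gamma \circ \pi) = T\gamma$. Hence we write $v(t, \gamma') = v(t, \gamma \circ \pi)$ and $v(T, \gamma \circ \pi) = v(T, \gamma) = \psi(T\gamma)$.
It is sufficient to show that the function $u(t, \gamma) = v(t, \gamma \circ \pi)$ is a solution of (5.25). From the
uniqueness of solutions to (5.25) we then conclude immediately that $u(t, \gamma) = v(t, \gamma)$. From the
definition of the generator $\mathcal{L}^\gamma$ we have

$$ \partial_t v(t, \gamma \circ \pi) + \sum_{k \in \Lambda^c_M} \sum_{x \in C_k} c^\gamma(x, \gamma \circ \pi)\,\big(v(t, (\gamma \circ \pi)^x) - v(t, \gamma \circ \pi)\big) = 0 . \tag{5.27} $$

Recall the definition of the rate $c^\gamma$

$$ c^\gamma(x, \gamma) = d_0\,(1 - \gamma(x)) + d_0\,\gamma(x)\,e^{-\beta \bar{U}(k(x),\, T\gamma)} , $$

and denote $c^\gamma(x, \gamma)$ by $C_\gamma(\gamma(x), k, T\gamma)$ to emphasise the dependence on $\gamma(x)$, $k$, and $\eta =
T\gamma$ only. Thus the inner summation in (5.27) becomes

$$ \sum_{x \in C_k} C_\gamma(\gamma \circ \pi(x), k, T\gamma)\,\big(v(t, (\gamma \circ \pi)^x) - v(t, \gamma \circ \pi)\big) . \tag{5.28} $$

On the other hand the definition of spin-flip dynamics leads to

$$ (\gamma \circ \pi)^x(z) = \begin{cases} \gamma(\pi z) , & z \ne x , \\ 1 - \gamma(\pi x) , & z = x , \end{cases} \qquad \text{while} \qquad \gamma^{\pi x}(\pi z) = \begin{cases} \gamma(\pi z) , & z \ne x , \\ 1 - \gamma(\pi x) , & z = x . \end{cases} \tag{5.29} $$

Hence we obtain

$$ (\gamma \circ \pi)^x(z) = \gamma^{\pi x}(\pi z) = (\gamma^{\pi x} \circ \pi)(z) , \tag{5.30} $$

and substituting into the expression (5.28) leads to

$$ \sum_{x \in C_k} C_\gamma(\gamma(\pi x), k, T\gamma)\,\big(v(t, (\gamma \circ \pi)^x) - v(t, \gamma \circ \pi)\big) $$
$$ = \sum_{x \in C_k} C_\gamma(\gamma(\pi x), k, T\gamma)\,\big(v(t, \gamma^{\pi x} \circ \pi) - v(t, \gamma \circ \pi)\big) $$
$$ = \sum_{y \in C_k} C_\gamma(\gamma(y), k, T\gamma)\,\big(v(t, \gamma^{y} \circ \pi) - v(t, \gamma \circ \pi)\big) $$
$$ = \sum_{y \in C_k} C_\gamma(\gamma(y), k, T\gamma)\,\big(u(t, \gamma^{y}) - u(t, \gamma)\big) . $$

Thus we have shown that

$$ \partial_t u(t, \gamma) + \sum_{k \in \Lambda^c_M} \sum_{x \in C_k} c^\gamma(x, \gamma)\,\big(u(t, \gamma^x) - u(t, \gamma)\big) = 0 . $$

Recalling the definition of $u(t, \gamma)$ we obtain that $v(t, \gamma \circ \pi)$ also solves (5.25). The uniqueness
of solutions to (5.25) implies that $v(t, \gamma \circ \pi) = v(t, \gamma)$ for all $\gamma$, or $v(t, \gamma') = v(t, \gamma)$ for all
$\gamma'$ such that $T\gamma' = T\gamma$.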

Now we can formulate and prove the weak error estimate that allows us to compare the 
microscopic process and its coarse-level approximation. We estimate the weak error on the 
microscopic level by comparing the microscopic process and its synthetic process. 

THEOREM 5.8 (Weak error). Let $\phi \in L^\infty(\mathcal{S}_N)$ be a test function (observable) on the
microscopic space satisfying (A1) and let $(\{\gamma_t\}_{t \ge 0}, \mathcal{L}^\gamma)$ be the synthetic Markov process (in
the sense of Definition 5.4) of the microscopic process $(\{\sigma_t\}_{t \ge 0}, \mathcal{L})$ with the initial condition
$\sigma_0 = S$. Then the weak error satisfies, for $0 < T < \infty$,

$$ |\mathbb{E}_S[\phi(\sigma_T)] - \mathbb{E}_S[\phi(\gamma_T)]| \le C_T \left(\frac{q}{L}\right)^2 , \tag{5.31} $$

where the constant $C_T$ is independent of $q$ and $L$ but depends on $T$.
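Proof: The two ingredients of the proof, the Feynman-Kac formula and the martingale
property, follow from the standard properties of Markov processes (see for example [26]). If
we define, for the microscopic process $\{\sigma_t\}_{t \ge 0}$ with the generator $\mathcal{L}$, the function

$$ u(t, S) = \mathbb{E}[\phi(\sigma_T) \mid \sigma_t = S] , $$

then it follows from the Feynman-Kac formula with zero potential that the function $u(t, S)$
solves the final value problem

$$ \partial_t u + \mathcal{L} u = 0 , \quad u(T, \cdot) = \phi , \quad t < T . \tag{5.32} $$

On the other hand the martingale property implies that for any smooth function $v(t, S)$
and the process $\{\gamma_t\}_{t \ge 0}$ with the generator $\mathcal{L}^\gamma$ we have

$$ \mathbb{E}_S[v(T, \gamma_T)] = \mathbb{E}_S[v(0, \gamma_0)] + \int_0^T \mathbb{E}_S[(\partial_s + \mathcal{L}^\gamma) v(s, \gamma_s)] \, ds . $$

The definition of $u(t, S)$ leads to the representation of the error $|\mathbb{E}_S[\phi(\sigma_T)] - \mathbb{E}_S[\phi(\gamma_T)]|$
by $e_w = |\mathbb{E}_S[u(0, S)] - \mathbb{E}_S[u(T, \gamma_T)]|$ and hence

$$ e_w = \left| \int_0^T \mathbb{E}_S[(\partial_s + \mathcal{L}^\gamma) u(s, \gamma_s)] \, ds \right| . $$

The function $u(t, S)$ solves the equation $\partial_t u = -\mathcal{L} u$, thus we obtain

$$ \mathbb{E}_S[\phi(\sigma_T) - \phi(\gamma_T)] = \int_0^T \mathbb{E}_S[\mathcal{L} u(t, \gamma_t) - \mathcal{L}^\gamma u(t, \gamma_t)] \, dt = \int_0^T \mathbb{E}_S\Big[ \sum_{x \in \Lambda_N} \big(c(x, \gamma_t) - c^\gamma(x, \gamma_t)\big)\,\partial_x u(t, \gamma_t) \Big] dt . $$

We split the summation $\sum_{x \in \Lambda_N}$, which gives us

$$ \mathbb{E}_S[\phi(\sigma_T) - \phi(\gamma_T)] = \int_0^T \mathbb{E}_S\Big[ \sum_{k \in \Lambda^c_M} \sum_{x \in C_k} \big(c(x, \gamma_t) - c^\gamma(x, \gamma_t)\big)\,\partial_x u(t, \gamma_t) \Big] dt $$
$$ = \int_0^T \mathbb{E}_S\Big[ \sum_{k \in \Lambda^c_M} \sum_{x \in C_k} \gamma_t(x)\,\big(e^{-\beta U(x, \gamma_t)} - e^{-\beta \bar{U}(k(x), T\gamma_t)}\big)\,\big(\partial_k v(t, T\gamma_t) + R^{q,L}_T(x)\big) \Big] dt . $$

Here we need to replace $\partial_x u$ by $\partial_x v$, where $v$ solves the final value problem (5.32) with
$\mathcal{L}$ replaced by $\mathcal{L}^\gamma$. From Lemma 5.6 we know that the error term $R^{q,L}_T(x) = \partial_x u(t, \gamma) -
\partial_x v(t, \gamma)$ is controlled by $O(q/L)$ in $\|\cdot\|_\infty$. Furthermore, Lemma 5.7 guarantees that with the
final condition $\phi$ which satisfies Assumption (A1) the solution depends only on $T\gamma$ and hence
we can replace the discrete difference $\partial_x v$ by the difference $\partial_k v(t, \eta) = v(t, \eta + \delta_k) - v(t, \eta)$,
where $\eta = T\gamma$. Next we expand the exponentials to obtain

$$ \Gamma(k, \gamma) = \sum_{x \in C_k} \gamma(x)\,e^{-\beta \bar{U}(k(x), T\gamma)} \Big( \beta \Delta(U, \bar{U}) + \tfrac{1}{2} \beta^2 \Delta^2(U, \bar{U}) + O\big(\beta^3 \Delta^3(U, \bar{U})\big) \Big) , $$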
and we recast the error representation into

$$ \mathbb{E}_S[\phi(\sigma_T) - \phi(\gamma_T)] = \int_0^T \mathbb{E}_S\Big[ \sum_{k \in \Lambda^c_M} \Gamma(k, \gamma_t)\,\partial_k v(t, T\gamma_t) + \sum_{x \in \Lambda_N} \big(c(x, \gamma_t) - c^\gamma(x, \gamma_t)\big)\,R^{q,L}_T(x) \Big] dt $$
$$ = \int_0^T \mathbb{E}_S\Big[ q \sum_{k \in \Lambda^c_M} \partial_k v(t, \eta_t)\,\mathbb{E}[\Gamma(k, \gamma) \mid T\gamma = \eta_t] \Big] dt \tag{5.33} $$
$$ \quad + \int_0^T \mathbb{E}_S\Big[ \sum_{x \in \Lambda_N} \big(c(x, \gamma_t) - c^\gamma(x, \gamma_t)\big)\,R^{q,L}_T(x) \Big] dt . \tag{5.34} $$

Assumption (A1) and Lemma 5.5 imply that the term $q \sum_{k \in \Lambda^c_M} \partial_k v(t, \eta_t)$ is bounded. To
estimate the conditional expectation we use the property of the reconstruction operator for
the process $\{\gamma_t\}_{t \ge 0}$; in particular, on each cell $\gamma_t(x)$ is reconstructed from $\eta_t(k)$ by assuming
a "local" equilibrium and distributing $\gamma_t(x)$ uniformly in the cell $C_{k(x)}$. Using this property
we can compute the conditional expectation explicitly and we obtain for $l \ne k$

$$ \mathbb{E}\Big[ \sum_{x \in C_k} \gamma(x)\,\Delta(U, \bar{U}) \,\Big|\, T\gamma = \eta \Big] = \eta_k \eta_l \sum_{x \in C_k} \sum_{y \in C_l} \big(J(x - y) - \bar{J}_{kl}\big) = 0 . $$

Similarly we handle the case $l = k$ and we conclude that, after averaging, the first-order term
$\Delta(U, \bar{U})$ in $\Gamma(k, \gamma)$ vanishes. We recall (see J33J) that

$$ \Delta(U, \bar{U}) = \bar{U}(k(x), T\gamma) - U(x, \gamma) = O\!\left(\frac{q}{L}\right) $$

and hence we can estimate (5.33) by $O(q^2/L^2)$. For the term (5.34) we use the estimate
$\sum_{x \in \Lambda_N} |R^{q,L}_T(x)| \sim O(q/L)$ from Lemma 5.6 and the Hölder inequality

$$ \mathbb{E}_S\Big[ \sum_{x \in \Lambda_N} \big(c(x, \gamma_t) - c^\gamma(x, \gamma_t)\big)\,R^{q,L}_T(x) \Big] \le \|c - c^\gamma\|_\infty \; \mathbb{E}_S\Big[ \sum_{x \in \Lambda_N} |R^{q,L}_T(x)| \Big] . $$

The first term on the right-hand side is estimated from (5.24) by $C(q/L)$ and hence the
left-hand side behaves as $O(q^2/L^2)$. Combining the estimates of (5.33) and (5.34) we conclude
the proof.

Using the estimate for the synthetic process and its reconstruction from the coarse-grained
process $\{\eta_t\}_{t \ge 0}$ we can compare the projected process $\{T\sigma_t\}_{t \ge 0}$ and the coarse-grained
process $\{\eta_t\}_{t \ge 0}$ also on the coarse level. The weak error for observables on the
coarse space is also natural in simulations where we usually project finer simulations on the
coarse level and use estimators for the coarse processes.

COROLLARY 5.9. Let $\psi \in L^\infty(\mathcal{S}^c_{M,q})$ be a test function on the coarse level such that
there exists a test function $\phi \in L^\infty(\mathcal{S}_N)$ satisfying (A1) with the property $\psi(T\sigma) = \phi(\sigma)$.
Given the initial configuration $\sigma_0$ we define the coarse configuration $\eta_0 = T\sigma_0$. Assume the
microscopic process $(\{\sigma_t\}_{t \ge 0}, \mathcal{L})$ with the initial condition $\sigma_0$ and the approximating coarse
process $(\{\eta_t\}_{t \ge 0}, \mathcal{L}^c)$ with the initial condition $\eta_0 = T\sigma_0$. Then the weak error satisfies, for
$0 < T < \infty$,

$$ |\mathbb{E}_S[\psi(T\sigma_T)] - \mathbb{E}_S[\psi(\eta_T)]| \le C_T \left(\frac{q}{L}\right)^2 , \tag{5.35} $$

where the constant $C_T$ is independent of $q$ and $L$ but depends on $T$.
6. Implementation of the coarse-grained Monte Carlo algorithms. The hierarchy of
coarse-grained Monte Carlo processes (CGMC) parametrised by $q$ has been designed in such
a way that it is easily implemented in a unified manner. In fact, the nature of the generator
$\mathcal{L}^c$ at the level $q$ allows us to use the same implementation as for the standard MC at the
microscopic level, i.e., $q = 1$.

The stochastic system is simulated with the kinetic Monte Carlo (KMC) algorithm. Each
iteration of the Monte Carlo simulation produces a variable time step $\Delta t$ within which a spin
flip occurs at a specific lattice node based on the transition probability

$$ [c_a(k, \eta) + c_d(k, \eta)]\,\Delta t + O(\Delta t^2) , $$

where $c_a$ and $c_d$ are as in (3.5). This procedure repeats until the stopping criteria (see below)
have been met. More specifically, the simulation implements the following global-update
(process-type) kinetic Monte Carlo algorithm for spin-flip Arrhenius dynamics:
Step 1 Calculate all transition rates $c_a(k, \eta)$ (adsorption), $c_d(k, \eta)$ (desorption) from (3.5) for
all nodes $k$ in the lattice $\Lambda^c_M$.
Step 2 Calculate the total adsorption and desorption rates $R_a = \sum_{l \in \Lambda^c_M} c_a(l, \eta)$ and
$R_d = \sum_{l \in \Lambda^c_M} c_d(l, \eta)$ respectively. Similarly obtain the total rate $R_T = R_a + R_d$.
Step 3 Obtain two random numbers $\rho_1$ and $\rho_2$.
Step 4 Use the first random number to choose between adsorption and desorption based on
the measure created by the rates $R_a$, $R_d$ and $R_T$. Assume that the choice is to adsorb
(desorb) and denote $c = c_a(l, \eta)$ ($c = c_d(l, \eta)$) and $R = R_a$ ($R = R_d$), respectively.
Step 5 Find the node at lattice position $l \in \Lambda^c_M$ such that

$$ \sum_{j=0}^{l} c(j, \eta) \ge \rho_2 R > \sum_{j=0}^{l-1} c(j, \eta) . $$

Step 6 Update the time, $t = t + \Delta t$, where

$$ \Delta t = 1/R_T . \tag{6.1} $$

Step 7 Repeat from Step 1 until equilibrium or dynamics of interest have been captured.
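Steps 1 to 7 can be sketched in a few lines. The following is a minimal illustration in Python with NumPy, not the implementation used for the simulations reported here. The rate function is passed in as a parameter; the example `noninteracting_rates` is a hypothetical placeholder without the interaction term and is not the rate (3.5). The node search uses a strict inequality on the upper partial sum so that nodes with zero rate are never selected, which is a small deviation from the inequality stated in Step 5.

```python
import numpy as np

def noninteracting_rates(q, d0=1.0, c0=1.0):
    """Placeholder rates (an assumption, not (3.5)): no interaction term."""
    def rate_fn(eta):
        c_a = d0 * (q - eta)   # adsorption onto the q - eta(k) empty sites of cell k
        c_d = c0 * eta         # desorption from the eta(k) occupied sites of cell k
        return c_a.astype(float), c_d.astype(float)
    return rate_fn

def kmc_step(eta, rate_fn, rng):
    """One global-update KMC iteration; modifies eta in place and returns the time step."""
    c_a, c_d = rate_fn(eta)                       # Step 1: all rates
    R_a, R_d = c_a.sum(), c_d.sum()               # Step 2: total rates
    R_T = R_a + R_d
    rho1, rho2 = rng.random(2)                    # Step 3: two uniform random numbers
    if rho1 * R_T < R_a or R_d == 0.0:            # Step 4: adsorption or desorption
        c, change = c_a, +1
    else:
        c, change = c_d, -1
    partial_sums = np.cumsum(c)                   # Step 5: locate the node
    l = int(np.searchsorted(partial_sums, rho2 * partial_sums[-1], side="right"))
    l = min(l, len(c) - 1)                        # guard against round-off at the upper end
    eta[l] += change
    return 1.0 / R_T                              # Step 6: time increment (6.1)

def run_kmc(eta0, rate_fn, n_steps, seed=0):
    """Step 7: repeat for a fixed number of steps (a simple stand-in stopping criterion)."""
    rng = np.random.default_rng(seed)
    eta = np.array(eta0, dtype=int)
    t = 0.0
    for _ in range(n_steps):
        t += kmc_step(eta, rate_fn, rng)
    return eta, t

# Illustrative usage: M = 10 cells of q = 5 sites, initially empty lattice.
eta_final, t_final = run_kmc(np.zeros(10, dtype=int), noninteracting_rates(q=5), n_steps=200, seed=1)
```

Passing the rates as a function keeps the same driver usable for every level of the hierarchy, including $q = 1$; recomputing all rates in every step is the simplest choice and corresponds to the global updating described above.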

As expected, a kinetic Monte Carlo algorithm produces no "null" steps and therefore
every trial is accepted. A similar version of the algorithm can also be implemented with a
local updating mechanism, which can improve speed substantially at the expense
of additional computer memory for dynamically allocated arrays. In the simulations that
follow we use a finite interaction range and a finite lattice size, $L, N < \infty$.

We produce simulations and compare observables at microscopic (q = 1) and coarse 
grained (q > 1) levels. For consistency purposes we use the same seed for our random 
number generator in order to compare simulations for different coarse grained values of q. 
This allows us to focus on the differences attributed only to the coarse graining variable and 
not on those resulting from different paths due to the initial seed. In the case of several 
realisations we initialise each new microscopic realisation with a different seed. Once again, 
for comparison purposes, we initialise each subsequent coarse grained realisation with the 
same seeds used in the respective microscopic simulations. All simulations are compared
in the same non-dimensional time units. The corresponding non-dimensional time step is
set by the Monte Carlo simulation according to the rule (6.1).

In the simulations which follow we try, when possible, to group together various parameters
in the model so that the results are presented with respect to variations in a smaller number
of parameters. In that respect we point out that for the fixed external field $h$ it is possible to
group together $d_0$ and $h$ in (3.5) as follows,

$$ c_d(k, \eta) = c_0\,\eta(k)\,e^{-\beta \left[ \sum_{l \ne k} \bar{J}(k, l)\,\eta(l) + \bar{J}(0, 0)\,(\eta(k) - 1) \right]} , $$

where $c_0 = d_0 e^{h}$. We provide the values of all pertinent parameters as well as $d_0$ and $c_0$ in
the relevant figures.

[Figure 7.1: plot "Coverage vs External Potential"; horizontal axis: external potential $h$.]

FIG. 7.1. Equilibrium coverage $c(\sigma)$ and its dependence on the external field. The critical point for the {0, 1}
spins satisfies $\beta_c J_0 = 4$. The solid line depicts equilibria below the critical temperature. The hysteresis shape of
the curve manifests the existence of two equilibria in the neighbourhood of the zero external field.

7. Numerical simulations. We use the CGMC described and analyzed in the previous 
sections for efficient simulations in the spin systems that undergo phase transitions. Within 
the context of spin-flip dynamics a typical example is nucleation of spatial regions of a new 
phase or a transition from one phase (all spins equal to zero) to another (all spins equal to 
one). In such simulations the emphasis is on the path-wise properties of the coarse-grained 
process so that the switching mechanism is simulated efficiently while approximation errors 
are controlled. We compare simulations on the microscopic level q = 1 with those performed 
on different levels of coarse-graining hierarchy parametrized by q. 

The qualitative behaviour of the Ising model with a long-range potential can be understood
from the mean-field approximation of the equilibrium total coverage $c(\sigma)$. Below the
critical temperature the Gibbs measure is not unique (in the thermodynamic limit $N \to \infty$)
and two phases can coexist. When the energy landscape is probed by changing the external
field $h$ we observe non-uniqueness of the equilibrium coverage as depicted in Figure 7.1. The
fluctuations allow for transitions between the equilibria, which leads to nucleation of regions
with a different phase. Changing the external field $h$ makes the original phase unstable and a
switching occurs: the system transforms into the other equilibrium configuration.

The parameters in the simulations have been chosen as follows: We use a uniform finite 
range potential for all examples presented. We simulate a finite lattice with a total of N = 
1000 microscopic nodes and allow a potential interaction range of 2L + 1 for L = 100. We 
choose the constant do = 1 so that c a = 1 and Cd = 1. Hence in this case the critical value 
P c is given by (3 C Jo = 4. If (3 Jo > (3 C Jo = 4 the system is in the phase transition regime and 
the two phases can coexist. In this region we typically observe a transition from one phase 
(e.g., zero (low) coverage) to the other phase (e.g., full coverage). For the phase transition 
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FIG. 7 .2. Relaxation dynamics. Comparison of microscopic (q = 1) and coarse grained (q = 10) simulations. 
The plot depicts a short time simulation in order to calibrate the code and compare to Figure 4 from [20\. 



[Figure 7.3: plot "Coverage in Time"; curves for microscopic q = 1 and coarse-grained q = 10, 20, 50 and 1000; nodes N = 1000, potential radius L = 100; horizontal axis: time (non-dim).]

FIG. 7.3. Time series of the coverage $c_t^q$. Simulations for different coarse-graining ratios are shown in the
phase transition regime. The case q = 1000, m = 1 (mean-field approximation) shows significant discrepancy.
Parameters used: potential radius length L = 100, $\beta J_0 = 6$, $d_0 = 1$, $c_0 = 0.072$.



examples we fix $\beta J_0 = 6 > \beta_c J_0$. The simulations become difficult when $\beta \approx \beta_c$ and there
is no external field h applied. We note that the coarse-graining algorithm will not perform
well close to the critical point $\beta_c$ when h = 0. In the numerical studies we first investigate
approximation properties of the CGMC algorithms for certain global quantities. 

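Coverage: We define the coverage $c_t$ to be the process computed as the spatial mean
of the configuration $\sigma_t$ over the N lattice sites,

$$c_t = \frac{1}{N} \sum_{x} \sigma_t(x).$$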
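
As an illustration of the coverage as a spatial mean, the following minimal sketch computes it both from a microscopic configuration and from coarse variables. It assumes occupation variables in {0, 1} and coarse variables defined as block sums over cells of q consecutive sites; the function names and the random test configuration are hypothetical.

```python
import numpy as np

def coverage(sigma):
    """Spatial mean of a microscopic configuration sigma(x) in {0, 1}."""
    return float(np.mean(sigma))

def coarse_variables(sigma, q):
    """Block sums eta(k) over cells of q consecutive sites (N divisible by q)."""
    return np.asarray(sigma).reshape(-1, q).sum(axis=1)

def coarse_coverage(eta, q):
    """Coverage recovered from coarse variables: sum(eta) / (m * q)."""
    return float(np.sum(eta)) / (len(eta) * q)

rng = np.random.default_rng(0)
sigma = rng.integers(0, 2, size=1000)   # N = 1000 sites, synthetic configuration
eta = coarse_variables(sigma, q=10)     # m = 100 coarse cells
print(coverage(sigma), coarse_coverage(eta, q=10))
```

Under these assumptions the coverage is a function of the coarse variables alone, so for a fixed configuration both routines return the same value; any discrepancy between simulations at different q therefore comes from the dynamics, not from the observable.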

We present the time evolution of the coverage in the phase transition regime, $\beta J_0 = 6$. Note
that the case q = 1000, m = 1, which corresponds to the mean-field approximation ("over
coarse-grained" interactions), does not follow the phase transition path of the other simulations.
On the other hand, the agreement in the results is extremely good for the remaining
[Figure 7.4: two panels. Left panel "Weak and Strong Errors and Convergence Rates", horizontal axis: coarse graining level q. Right panel "Weak vs Strong Convergence Rates", horizontal axis: realizations.]

FIG. 7.4. Estimated weak $e_w[c]$ and strong $e_s[c]$ errors. We compare the exact process $c_t$, q = 1 with coarse
approximations $c_t^q$, q = 10, 25, 50 and 100. The simulation parameters were fixed at L = 100, $d_0 = 1$, $c_0 = 0.07$,
$\beta J_0 = 6 > \beta_c J_0$ and the lattice size N = 1000. The convergence rates depicted are estimated by the linear best
fit on the logarithmic scale. The statistical error or dependence of the estimates on the number of realisations is
depicted in the right figure.



values of q. Furthermore, these numerical results indicate path-wise (strong) approximation
of the microscopic process by the coarse-grained process. This observation suggests a
stronger error control than the relative entropy estimate provided by Proposition 5.2.

To quantify the error behaviour we calculate two errors between the exact stochastic
process $c_t$ and its coarse approximation $c_t^q$ at the level of coarse-graining q. We define the
weak error $e_w[c]$ and the strong error $e_s[c]$ respectively:

$$e_w[c] = \int_0^T \left| \mathbb{E}[c_t] - \mathbb{E}[c_t^q] \right| dt, \qquad e_s[c] = \int_0^T \mathbb{E}\left[ \left| c_t - c_t^q \right| \right] dt.$$

The expected values are estimated by empirical means and the integral in time by the piece- 
wise constant quadrature. 
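
The estimation procedure just described can be sketched as follows. This is a minimal illustration, assuming that the paths are stored on a common uniform time grid and that fine and coarse realisations are paired row by row (the pairing matters only for the strong error); the function name and array layout are hypothetical.

```python
import numpy as np

def weak_and_strong_errors(c_fine, c_coarse, dt):
    """Estimate e_w[c] and e_s[c] from sample paths on a uniform time grid.

    c_fine, c_coarse: arrays of shape (n_realisations, n_steps) holding
    paired coverage paths; dt: spacing of the time grid.
    """
    c_fine = np.asarray(c_fine, dtype=float)
    c_coarse = np.asarray(c_coarse, dtype=float)
    # Weak error: difference of the empirical means at each time.
    weak_integrand = np.abs(c_fine.mean(axis=0) - c_coarse.mean(axis=0))
    # Strong error: empirical mean of the path-wise difference at each time.
    strong_integrand = np.abs(c_fine - c_coarse).mean(axis=0)
    # Piecewise-constant quadrature in time.
    return dt * weak_integrand.sum(), dt * strong_integrand.sum()

# Tiny synthetic example (not data from the simulations).
e_w, e_s = weak_and_strong_errors([[0, 1], [1, 0]], [[1, 0], [0, 1]], dt=0.5)
print(e_w, e_s)
```

In the synthetic example the empirical means coincide while the individual paths differ, so the weak estimate vanishes and the strong one does not; in general the weak estimate computed this way never exceeds the strong one.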

The simulations allow us to estimate the convergence rate for both errors. The rates in
the case of fixed parameters L = 100, $d_0 = 1.0$, $c_0 = 0.07$ and $\beta J_0 = 6$ on the lattice of
the size N = 1000 are depicted in Figure 7.4. Note that we need to eliminate the statistical
error arising from the approximation of expected values by empirical means. However, as seen
in Figure 7.4, the estimator of the rate converges as the number of realisations tends to infinity.
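
A convergence rate of this kind can be estimated by a linear least-squares fit on the logarithmic scale, as in the following minimal sketch. The error values below are synthetic and chosen only to exercise the fit; they are not results from the simulations, and the function name is hypothetical.

```python
import numpy as np

def estimate_rate(q_values, errors):
    """Slope of the least-squares line through (log q, log error)."""
    slope, _intercept = np.polyfit(np.log(q_values), np.log(errors), 1)
    return slope

q_values = np.array([10.0, 25.0, 50.0, 100.0])
synthetic_errors = 1e-4 * q_values**2   # synthetic data, NOT from the paper
print(estimate_rate(q_values, synthetic_errors))
```

Repeating the fit for an increasing number of realisations gives a simple check of how strongly the estimated slope depends on the statistical error.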

Since the coarse-grained Hamiltonian neglects higher order corrections arising from the
fluctuations on fine scales, one may expect that the approximation is poor if q/L is not very
small. This is certainly true at the critical point (i.e., $\beta = \beta_c$ and h = 0), but further from
the critical point the approximation properties are improved. This is demonstrated in
Table 7.1, where the simulations were performed in the presence of different (large)
external fields. The relative error becomes small even for fairly crude coarse-graining q = 20
in the case of shorter interaction radii L.

Mean time to reach phase transition: One quantity of interest that is calculated from the
simulations is the mean time $\bar{\tau}_T = \mathbb{E}[\tau_T]$ until the coverage reaches $C_+$ in its phase transition
regime (see Figure 7.3). The random exit time is defined as $\tau_T = \inf\{t > 0 \mid c_t > C_+\}$. We
estimate the probability distributions $\rho_\tau$ and $\rho_\tau^q$ from the simulations. We record a phase
transition at the time $\tau_T$ when the coverage exceeds the threshold value $C_+ = 0.9$.
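
The recording of exit times can be sketched as below. This is a minimal illustration, assuming each realisation is stored as a coverage path sampled at known times; paths that never cross the threshold within the simulated window are reported separately, which is a choice of this example and not a statement about the original code. All names are hypothetical.

```python
import numpy as np

def exit_time(times, coverage_path, threshold=0.9):
    """First recorded time at which the coverage exceeds the threshold.

    Returns np.inf if the path never crosses within the simulated window.
    """
    above = np.asarray(coverage_path) > threshold
    if not above.any():
        return np.inf
    # argmax on a boolean array returns the index of the first True entry.
    return float(np.asarray(times)[np.argmax(above)])

def mean_exit_time(times, coverage_paths, threshold=0.9):
    """Empirical mean over the realisations that crossed, and their count."""
    taus = np.array([exit_time(times, p, threshold) for p in coverage_paths])
    crossed = np.isfinite(taus)
    return float(taus[crossed].mean()), int(crossed.sum())

# Tiny synthetic example (not data from the simulations).
times = [0.0, 1.0, 2.0, 3.0]
paths = [[0.1, 0.5, 0.95, 1.0], [0.1, 0.2, 0.3, 0.4], [0.1, 0.95, 1.0, 1.0]]
print(mean_exit_time(times, paths))
```

Discarding realisations that have not crossed biases the empirical mean downwards, so in practice the simulation window should be long enough that essentially all paths reach the threshold.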
Table 7.1

Relative strong error $e_s[c]$ in the presence of an external field defined by $c_0$. Comparisons are made for
different values of the interaction radius L and different coarse-graining levels q. Size of the lattice fixed at N =
1000.

    c_0      L      q = 5     q = 10    q = 20
    .07     100     .0591     .0733     .1134
    .07      40     .0820     .0880     .1113
    .07      20     .1508     .2214     .1832
    .09     100     .0186     .0563     .0480
    .09      40     .0678     .0749     .1064
    .09      20     .1760     .1767     .1812
    1       100     .0010     .0010     .0025
    1        40     .0036     .0040     .0054
    1        20     .0016     .0043     .0065


Table 7.2

Approximation of the mean time $\bar{\tau}_T$, relative entropy and relative error.

    L       q     Mean time   Rel. entropy   Rel. Err.   CPU [s]
    100     1        532         0.0                      309647
    100     2        532         0.003         0.01%      132143
    100     4        530         0.001         0.22%       86449
    100     5        534         0.003         0.38%       58412
    100    10        536         0.004         0.82%       38344
    100    20        550         0.007         3.42%       16215
    100    25        558         0.010         4.91%        7574
    100    50        626         0.009        17.69%        4577
    100   100        945         0.087        77.73%         345
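
The CPU times reported in Table 7.2 can be turned into speed-up factors relative to the microscopic simulation, as in the following short sketch. The numbers are copied from the table; the dictionary layout and variable names are hypothetical.

```python
# CPU times [s] per coarse-graining level q, copied from Table 7.2 (L = 100).
cpu_seconds = {1: 309647, 2: 132143, 4: 86449, 5: 58412, 10: 38344,
               20: 16215, 25: 7574, 50: 4577, 100: 345}

# Speed-up relative to the microscopic simulation q = 1.
speedup = {q: cpu_seconds[1] / t for q, t in cpu_seconds.items()}
for q, s in speedup.items():
    print(f"q = {q:3d}: speed-up {s:7.1f}x")
```

Read together with the relative-error column, this gives a rough cost-accuracy trade-off: for example, q = 10 runs about eight times faster than q = 1 while the reported relative error stays below 1%.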



In Figure 7.5 we plot approximations of the Probability Density Functions (PDFs) of $\tau_T$
and compare them for different values of q.

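The qualitative agreement observed in Figure 7.5 is quantified by using the information
distance for error estimation, i.e., by estimating the relative entropy

$$\mathcal{R}(\rho_1 \mid \rho_2) = \sum_{\lambda} \rho_1(\lambda) \log\left( \frac{\rho_1(\lambda)}{\rho_2(\lambda)} \right). \tag{7.1}$$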
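
A histogram-based estimate of this relative entropy can be sketched as follows. It is a minimal illustration, assuming both sample sets are binned on a shared grid of 100 bins; the treatment of empty bins is an ad hoc choice of this example, and the function name is hypothetical.

```python
import numpy as np

def relative_entropy(samples_1, samples_2, bins=100):
    """Estimate R(rho_1 | rho_2) from two sample sets on a shared binning."""
    lo = min(np.min(samples_1), np.min(samples_2))
    hi = max(np.max(samples_1), np.max(samples_2))
    edges = np.linspace(lo, hi, bins + 1)
    p1 = np.histogram(samples_1, bins=edges)[0].astype(float)
    p2 = np.histogram(samples_2, bins=edges)[0].astype(float)
    p1 /= p1.sum()
    p2 /= p2.sum()
    # Bins with rho_1 = 0 contribute zero. Bins with rho_2 = 0 but
    # rho_1 > 0 would give +inf, so they are skipped here (ad hoc choice).
    mask = (p1 > 0) & (p2 > 0)
    return float(np.sum(p1[mask] * np.log(p1[mask] / p2[mask])))

# Tiny synthetic example (not data from the simulations).
r = relative_entropy([0, 0, 0, 1], [0, 1, 1, 1], bins=2)
print(r)
```

Because bins that are empty in the second histogram are skipped, the estimate can be distorted when the supports of the two sample sets differ noticeably; with the sample sizes used for the PDFs this effect is expected to be confined to the tails.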

Nucleation: The nucleation of a new phase is a typical phenomenon in the regime where
$\beta > \beta_c$. Essentially, there exist two equilibria (phases). Random fluctuations will induce
transitions from one state to another by overcoming energy barriers that separate the equilibria.
We investigate approximation of the path-wise behaviour on the configuration space for
nucleation of a new phase. Two different initial configurations are used.
Test Case I: The initial state is at the metastable equilibrium where the coverage is zero.
The fluctuations will cause the transition to the full coverage equilibrium, which is stable
due to the external applied field. We present only a qualitative comparison in the series of
snap-shots (Figure 7.8) of the phase transition from the uniform (zero) initial coverage to
the full coverage. We observe a striking path-wise agreement on the configuration space for
relatively large values of q compared to the interaction radius L. However, as the ratio q/L
increases the corresponding coarse-grained process lags behind, which is also demonstrated in
the expected values of transition times. Such behaviour suggests that fluctuations at regions
[Figure 7.5: plot "PDFs for q = 1, 10, 20, 50"; legend: q = 1, mean = 537; q = 10, mean = 543; q = 20, mean = 555; q = 50, mean = 630; Cont. PDF; horizontal axis: non-dim time.]

FIG. 7.5. Probability Density Function (PDF) comparisons between different coarse-graining levels q. The
estimated mean times for each PDF are shown in the figure. All PDFs are comprised of 10000 samples and the
histogram is approximated by 100 bins.




[Figure 7.6: plot; recoverable axis label: CPU time in sec x 1000.]

FIG. 7.6. The dependence of the relative error and the relative entropy on the coarse-graining level q. The left
scale on the vertical axis depicts the relative error and q on the log-log scale. Measurements based on averaging
over 10000 realisations for each q.



with uniform states are well-approximated by a highly coarse-grained process while finer 
resolution is necessary for resolving nucleation of new phases through islands. 
Test Case II: We have already documented the path-wise agreement of the approximating
dynamics under both transition and relaxation cases. In this example we examine the nucleation
phenomenon at the critical parameter regime of phase transition $\beta J_0 = 6$. We chose
the initial state to be at a saddle point of the energy surface, i.e., the mean coverage is set to
0.5. Snapshots of the spatial distribution of spins are presented in Figure 7.7. Under all four
dynamics examined, q = 1, 5, 10 and 20, we observe complete spatial path-wise agreement.
Over time the total coverage may fall towards zero or rise towards one, in which case it will
remain there since we are in the phase transition regime where these represent stable equilibria.
Further spatial examples of nucleation are shown below in Test Case III under
the assumption of an "island-type" initial state.




FIG. 7.7. Snap-shots of the transition from the initial state with the mean coverage at 0.5. Comparisons 
between the microscopic q = 1 and coarse grained simulations q = 5, 10 and q = 20. The interaction radius is set
to L = 100, the external field $c_0 = 0.0492$, $d_0 = 1$ and the total number of lattice sites N = 1000.



Test Case III: The last set of simulations presents evolution from the non-uniform initial 
state, giving a qualitative comparison of nucleation from an island of a given size (Figure[^J. 
In these simulations we observe spatial propagation of the interface in time for different initial
sizes of the island.
[Figure panels titled "Spatial Comparisons of Phase Transition".]

FIG. 7.8. Snap-shots of the transition from zero initial spatial distribution. Comparisons between the microscopic
q = 1 and two coarse grained simulations q = 10 and q = 50. The interaction radius is set to L = 200
while total nodes are N = 10000.